Abstract

We  set  out  a  modal  logic  for  reasoning  about  multilevel  security  of  probabilistic  systems.  This 
logic  contains  expressions  for  time,  probability,  and  knowledge.  Making  use  of  the  Halpern- 
Tuttle  framework  for  reasoning  about  knowledge  and  probability,  we  give  a  semantics  for  our 
logic  and  prove  it  is  sound.  We  give  two  syntactic  definitions  of  perfect  multilevel  security  and 
show  that  their  semantic  interpretations  are  equivalent  to  earlier,  independently  motivated 
characterizations.  We  also  discuss  the  relation  between  these  characterizations  of  security 
and  between  their  usefulness  in  security  analysis. 
Figure 1: The General Form of a System

1  Introduction 

Multilevel  security  is  the  aspect  of  computer  security  concerned  with  protecting  information 
that  is  classified  with  respect  to  a  multilevel  hierarchy  (e.g.,  UNCLASSIFIED,  SECRET, 
TOP SECRET). A probabilistic system is a hardware or software system that makes probabilistic
choices (e.g., by consulting a random number generator) during its execution. Such
probabilistic  choices  are  useful  in  a  multilevel  security  context  for  introducing  noise  to  reduce 
the  rate  of  (or  eliminate)  illicit  communication  between  processes  at  different  classification 
levels.  In  this  paper,  we  are  concerned  with  definitions  of  perfect  (information-theoretic) 
multilevel  security  in  the  sense  that  the  definitions  rule  out  all  illicit  communication  without 
relying on any complexity-theoretic assumptions. That is, our model allows the system
penetrators to have unlimited computational power; yet, our definitions are sufficient to ensure
there  can  be  no  illicit  communication. 

The  systems  we  address  can  be  depicted  in  the  form  shown  in  Figure  1.  This  general  form 
is  intended  to  represent  systems  including  physical  hardware  with  hard-wired  connections 
to  other  systems,  an  operating  system  kernel  with  connections  to  other  processes  provided 
by  shared  memory,  and  processes  executing  on  a  multiprocessor  with  connections  to  other 
systems  provided  by  an  interprocess  communication  (IPC)  mechanism. 

•  There  is  a  system,  called  E,  that  provides  services  to  the  other  systems.  For  example, 
in  the  case  of  a  multiuser  relational  database,  E  would  store  and  control  access  to  a  set 
of  relations.  E  is  the  system  with  respect  to  which  we  will  be  reasoning  about  multilevel 
security. 

• There is a set of systems (labeled $S_1, S_2, \ldots, S_i$ in the figure), called the “covert senders”,
that have access to secret information. These systems are called “covert senders” because
they may attempt to covertly send secret information, via E, to other systems that are not
authorized  to  see  the  information.  It  is  these  attempts  with  which  we  are  concerned.  As  is 
commonly  done  in  the  literature,  we  will  often  refer  to  the  covert  senders  as  high  systems 
(referring to the situation where the covert senders have access to highly classified information).
We will also refer to the set of covert senders collectively as the high environment,
denoted $\mathcal{H}$. These systems are part of “the environment” in the sense that they are in the
environment  of  the  central  system,  E. 

• There is a second set of systems (labeled $R_1, R_2, \ldots, R_j$ in the figure), called the “covert
receivers”, that are not authorized to see the secret information that is available to the covert
senders. We will often refer to the covert receivers as low systems, or collectively as the low
environment, denoted $\mathcal{L}$.

If  the  covert  senders  are  able  to  use  E  to  communicate  information  to  the  covert  receivers, 
we will say that E has a covert channel, or equivalently, for our purposes, that E is insecure.
A  few  notes  are  in  order. 

1.  It  is  important  to  bear  in  mind  that  the  threat  that  we  are  concerned  with  is  not  that 
the  users  (i.e.,  the  human  users)  of  the  covert  sender  systems  are  attempting  to  send 
secret  information  to  the  covert  receivers.  We  assume  that  if  they  wanted  to,  they  could 
more  easily  pass  notes  in  the  park  and  entirely  bypass  E.  Rather,  we  are  concerned  that 
the  covert  senders  are  actually  trojan  horses  (i.e.,  they  appear  to  be  something  that  the 
user  wants,  but  actually  contain  something  else  that  is  entirely  undesirable  to  the  user) 
and  that  these  trojan  horses  are  attempting  to  send  secret  information  to  the  covert 
receivers.  This  is  a  legitimate  concern  since  system  developers  do  not  want  to  incur  the 
cost  of  verifying  every  component  of  a  conglomerate  system  with  respect  to  multilevel 
security  requirements.  Ideally,  only  a  small  number  of  components  in  the  system  (e.g., 
in  our  case  only  E)  have  security  requirements  and  thereby  require  verification;  while 
the  remaining  components  can  be  implemented  by  off-the-shelf  hardware  and  software 
that  are  unverified  with  respect  to  security  (and  therefore  may  be  trojan  horses). 

We  assume  a  worst  case  scenario,  where  all  of  the  covert  senders  and  covert  receivers 
are  trojan  horses.  Indeed,  we  assume  that  all  of  the  trojan  horses  are  cooperating  in 
an  attempt  to  transmit  information  from  the  covert  senders  to  the  covert  receivers. 

2.  It  is  also  important  to  bear  in  mind  that  in  our  intended  application,  the  covert  senders 
will not be able to communicate directly to the covert receivers (i.e., by bypassing E).
Typically,  there  are  software,  hardware  or  other  physical  controls  to  prevent  this.  For 
example,  non-bypassability  is  one  of  the  well-known  principles  of  a  “reference  monitor” 
(see  [13]),  which  is  one  of  the  typical  applications  we  have  in  mind. 

3.  Our  model  contrasts  sharply  with  much  other  work  on  security  (e.g.,  [31],  [11])  in 
that  we  consider  a  set  of  untrusted  agents  (viz,  the  covert  senders  and  receivers)  that 
are  connected  via  a  trusted  agent,  whereas  these  other  works  consider  a  set  of  trusted 
agents  connected  via  an  untrusted  agent.  This  difference  in  our  model  reflects  the 
difference  in  the  respective  applications.  The  work  of  Meadows  in  [31]  and  Dolev  et 
al.  in  [11]  is  intended  for  the  analysis  of  a  set  of  legitimate  (and  trusted)  agents  that 
are  attempting  to  establish  secure  communication  over  an  untrusted  network.  In  that 
work,  the  assumption  is  that  the  penetrator  is  able  to  subvert  the  network  (i.e.,  the 
central  component  of  the  system),  but  not  the  trusted  (lateral)  agents. 

In  contrast,  our  work  is  intended  to  be  used  to  analyze  a  centralized  server  that  serves 
a  set  of  untrusted  entities.  Correspondingly,  our  assumption  is  that  the  penetrator 
may  be  able  to  subvert  the  untrusted  (lateral)  agents,  but  not  the  central  server. 

4.  The  fact  that  we  have  partitioned  the  set  of  systems  external  to  E  into  two  sets, 
high  and  low,  may  seem  to  indicate  that  we  are  limiting  ourselves  to  two  levels  of 
information  (e.g.,  SECRET  and  UNCLASSIFIED).  However,  this  is  not  the  case.  In 
a  more  general  setting,  information  is  classified  (users  are  cleared,  resp.)  according  to 
a  finite,  partially  ordered  set  (see,  e.g.,  Denning’s  [9]);  that  is,  there  is  a  finite  set  of 
classification  levels  (clearance  levels,  resp.)  that  is  ordered  by  a  reflexive,  transitive, 
and  anti-symmetric  relation,  which  we  call  dominates.  (In  fact  this  set  forms  a  finite 
lattice.)  A  given  user  is  permitted  to  observe  a  given  piece  of  information  only  if  the 
user’s  clearance  dominates  the  classification  of  the  information.  In  the  case  where  there 
are  more  than  two  levels,  a  separate  analysis  would  be  performed  for  each  level,  x;  in 
each  analysis,  the  set  of  levels  would  be  partitioned  into  those  that  are  dominated  by 
x  (i.e.,  the  “low”  partition)  and  the  set  of  levels  that  are  not  dominated  by  x  (i.e.,  the 
“high”  partition).  Thus,  we  have  lost  no  generality  by  restricting  our  attention  to  two 
levels. 
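
The per-level analysis described in note 4 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the encoding of a level as a (rank, compartment set) pair, the compartment name "NATO", and the function names `dominates` and `partition` are hypothetical and do not appear in the formal development.

```python
def dominates(a, b):
    """True if level a dominates level b.

    A level is modeled (as an assumption) by a pair (rank, compartments).
    The relation is reflexive, transitive, and anti-symmetric.
    """
    rank_a, cats_a = a
    rank_b, cats_b = b
    return rank_a >= rank_b and cats_a >= cats_b


def partition(levels, x):
    """Split the levels into the "low" and "high" partitions for level x."""
    low = [lv for lv in levels if dominates(x, lv)]        # dominated by x
    high = [lv for lv in levels if not dominates(x, lv)]   # not dominated by x
    return low, high


UNCLASSIFIED = (0, frozenset())
SECRET = (1, frozenset())
SECRET_NATO = (1, frozenset({"NATO"}))   # hypothetical compartment
TOP_SECRET = (2, frozenset())
LEVELS = [UNCLASSIFIED, SECRET, SECRET_NATO, TOP_SECRET]

# One two-level analysis is performed per level x.
analyses = {x: partition(LEVELS, x) for x in LEVELS}
```

Each entry of `analyses` gives the low and high partitions for one run of the two-level analysis; levels that are incomparable with x fall into the high partition, as the text requires.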

The  motivation  for  reasoning  about  the  probabilistic  behavior  of  systems  has  appeared  in 
examples and discussions of many authors (cf. [4, 17, 26, 28, 30, 42]). Essentially, the
motivation  is  that  it  is  possible  for  a  probabilistic  system  to  satisfy  many  existing  definitions 
of  security  (e.g.,  Sutherland’s  Nondeducibility  [40],  McCullough’s  Restrictiveness  [29],  etc.) 
and  still  contain  probabilistic  covert  channels. 

Our  long  term  goal  is  to  develop  a  logic  that  can  be  used  to  reason  about  the  multilevel 
security  of  a  given  system  E.  The  logical  definition  of  security  in  this  paper  delineates  ideal 
security  for  probabilistic  systems.  As  such  it  does  not  apply  to  real  systems,  which  are  too 
complex  to  be  simultaneously  ideally  secure  and  adequately  functional.  Nonetheless,  it  is 
important  to  establish  the  ideal  in  order  to  know  what  is  possible.  Further,  we  view  the 
present  work  as  a  step  in  the  direction  of  practical  verification  of  multilevel  security  of  real 
probabilistic  systems  (presumably  based  on  definitions  of  security  that  allow  some  limited 
information  flow). 

In  prior  work  ([18]),  we  gave  a  logic  in  which  an  information-theoretic  definition  of  security — 
the  first  author’s  Probabilistic  Noninterference  (PNI)  ([17]) — was  expressed  as 

$K_L(\varphi) \rightarrow R_L(\varphi)$    (1)

where $K_L(\varphi)$ is intuitively regarded as “L knows $\varphi$” and $R_L(\varphi)$ is intuitively regarded as
“L is permitted to know $\varphi$.” Thus, Formula 1 is intuitively interpreted as “If L knows $\varphi$
then L is permitted to know $\varphi$”, or in other words, “what L knows is a subset of what L is
permitted to know”.

This  intuitively-appealing  formula  was  proposed  by  Glasgow,  MacEwen,  and  Panangaden 
[14] and further developed by Bieber and Cuppens [1, 2]. In other work [19] we extended
their approach to a probabilistic framework, retaining the syntactic form of their definition
of security, viz. Formula 1. However, the knowledge operator ($K_L$) proposed by Bieber and
Cuppens (and its probabilistic analog proposed by us) is nonstandard and rather unnatural.
For  example,  what  a  subject  “knows”  does  not  change  over  time.  In  particular,  in  our 
probabilistic  framework,  subjects  “know”  the  probability  distribution  over  all  of  their  future 
interactions,  including  all  future  outputs  they  will  receive.  This  is  in  contrast  with  the 
intuitive  notion  of  knowledge  (as  well  as  the  standard  formalizations  of  knowledge  such  as 
by Chandy and Misra [6] or Halpern [21]) wherein a subject can acquire knowledge as it
interacts  with  its  environment. 

Another disadvantage of the prior work of Glasgow et al., Bieber and Cuppens, and the
present authors is that Formula 1 makes use of a “permitted knowledge” operator ($R_L$).
Such  an  operator  has  no  standard  semantics  and  seems,  by  its  very  nature,  to  be  application 
specific.  For  example,  see  [8]  wherein  “permitted  knowledge”  is  essentially  formalized  as 
“knowledge  that  is  permitted,  as  defined  in  the  present  application”. 

In  the  present  work,  we  develop  a  new  formalization  of  PNI  using  the  framework  of  Halpern 
and Tuttle [22]. In this framework, the knowledge operator is given the standard semantics.
Also,  our  new  formalization  does  not  make  use  of  a  “permitted  knowledge”  operator;  it  is 
therefore  free  of  the  nonstandard  operators  that  were  used  in  our  previous  formalization. 
Thus,  the  present  paper  can  be  viewed  as  superseding  our  prior  work. 

In another sense, the present work can be viewed as a novel application of Halpern and
Tuttle's framework, since we instantiate their framework with an adversary (see Definition 2.1)
fundamentally different from those described in [22].

The remainder of the paper is organized as follows. In §2 we set out our model of computation.
In §§3 and 4, we set out the syntax and semantics of our logic and in §5, we prove its
soundness.  In  §6  we  state  our  primary  definition  of  security  and  prove  that  it  is  equivalent 
to  Probabilistic  Noninterference.  In  §7  we  state  our  verification  condition  and  show  that  it 
is  equivalent  to  the  Applied  Flow  Model.  Finally,  in  §8,  we  give  some  conclusions  of  this 
work. 


2  System  Model 

In  this  section,  we  describe  our  system  model.  This  is  the  model  by  which  we  will  (in  §4) 
give  semantics  to  our  logic.  First,  we  describe  the  general  system  model,  which  is  taken 
from Halpern and Tuttle [22]. The framework of Halpern and Tuttle builds on the work of
Fagin and Halpern in [12]. It also encompasses earlier work of Ruspini in [35]. After giving
the general system model, we tailor the model to our needs by imposing some additional
structure on the model and (in Halpern and Tuttle's terminology) choosing the “adversaries”,
resulting  in  our  application-specific  model. 
2.1 General System Model

We have a set of agents, $P_1, P_2, \ldots, P_n$, each with its own local state. The global state is
an  n-tuple  of  the  local  agents’  states.1  A  run  of  the  system  is  a  mapping  of  times  to  global 
states.  We  assume  that  time  is  discrete  because  we  are  dealing  with  security  at  the  digital 
level  of  the  system.  We  are  not,  for  example,  addressing  security  issues  such  as  analog 
channels  in  hardware.  Therefore,  we  will  assume  that  times  are  natural  numbers. 

The  probabilities  of  moving  among  global  states  are  represented  in  the  model  by  means  of 
labeled  computation  trees.  The  nodes  of  the  trees  represent  global  states.  For  any  given 
node  in  a  tree,  the  children  of  that  node  represent  the  set  of  global  states  that  could  possibly 
come  next.  Each  arc  from  a  node  to  one  of  its  children  is  labeled  with  the  probability  of 
moving to that state. Thus, from any given node, the sum of the probabilities on its outgoing
arcs must be one. We also assume the set of outgoing arcs is finite and that all arcs are
labeled  with  nonzero  probabilities.  This  final  assumption  can  be  viewed  as  a  convention 
that  if  the  probability  of  moving  from  state  x  to  state  y  is  zero,  then  state  y  is  not  included 
as  a  child  of  state  x. 
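
The labeling conventions above can be made concrete with a minimal Python sketch. It assumes a finite prefix of a computation tree (leaves are treated as truncation points rather than terminal states); the class `Node`, the functions `is_well_labeled` and `path_probability`, and the state names are hypothetical and serve only as illustration.

```python
from fractions import Fraction


class Node:
    """A node of a labeled computation tree; it represents a global state."""

    def __init__(self, state, children=None):
        self.state = state
        # Each outgoing arc is a pair (probability, child node).
        self.children = children or []


def is_well_labeled(node):
    """Check that arcs are nonzero and sum to one at every internal node."""
    if not node.children:
        return True  # truncated leaf of the finite prefix
    probs = [p for p, _ in node.children]
    if any(p <= 0 for p in probs) or sum(probs) != 1:
        return False
    return all(is_well_labeled(child) for _, child in node.children)


def path_probability(root, states):
    """Probability of passing through the given successor states from root."""
    prob = Fraction(1)
    node = root
    for s in states:
        for p, child in node.children:
            if child.state == s:
                prob *= p
                node = child
                break
        else:
            return Fraction(0)  # not a child, i.e., a zero-probability move
    return prob


tree = Node("s0", [
    (Fraction(1, 2), Node("s1", [
        (Fraction(1, 3), Node("s3")),
        (Fraction(2, 3), Node("s4")),
    ])),
    (Fraction(1, 2), Node("s2")),
])
```

Exact rationals are used so that the sum-to-one check is not affected by floating-point rounding.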

Certain events in a system may be regarded as nonprobabilistic, while still being nondeterministic.
The typical example occurs when a user is to choose an input, and in the analysis
of  the  system  we  do  not  wish  to  assign  a  probability  distribution  to  that  choice;  in  such 
cases,  we  regard  that  choice  as  nonprobabilistic.  All  nonprobabilistic  choices  in  the  system 
are  lumped  into  a  single  choice  that  is  treated  as  being  made  by  an  “adversary”  prior  to  the 
start of execution. Thus, after this choice is made, the system's execution is purely probabilistic.
In Halpern and Tuttle's words, the nonprobabilistic choices have been “factored
out”.

In  the  model  of  computation,  each  possible  choice  by  the  adversary  corresponds  to  a  labeled 
computation  tree.  In  other  words,  a  system  is  represented  as  a  set  of  computation  trees, 
each  one  corresponding  to  a  different  choice  by  the  adversary.  There  is  no  indication  how 
the adversary's choice is made, just that it is made once and for all, prior to the start of
execution.

1 Halpern and Tuttle also include the state of the environment as part of the global state. In their usage
of the term, the “environment” is intended “to capture everything relevant to the state of the system that
cannot be deduced from the agents' local states” [22, §2]. This typically includes messages in transit on
the communication medium. However, we model such things as part of the covert senders' and receivers'
local states; we therefore omit what they call the environment from our model. In contradistinction, we
refer to everything external to E as “the environment”; viz., the covert senders and receivers constitute the
environment.


2.2  Application-Specific  System  Model 

In  this  section,  we  impose  some  additional  structure  on  the  general  model  described  in  the 
previous section. We fix the set of agents, fix our model and intuitions regarding communication,
place some (environmental) constraints on the agents, and fix the set of choices
available  to  the  adversary. 

AGENTS As indicated in Figure 1 and the surrounding discussion, we can limit our model
to three agents: (1) the system under consideration, denoted E, (2) the covert senders (or
alternatively, the high environment), denoted $\mathcal{H}$, and (3) the covert receivers (or alternatively,
the low environment), denoted $\mathcal{L}$. In the remainder of the paper, we will tacitly assume that
the global system consists of these three agents.

MODEL  OF  COMMUNICATION  Our  model  of  communication  is  similar  to  those  of 
Bieber  and  Cuppens,  Millen,  and  the  first  author  (cf.  [2],  [32],  and  [17],  respectively).  We 
view  E’s  interface  as  a  collection  of  channels  on  which  inputs  and  outputs  occur.  Since  we 
consider the agent $\mathcal{H}$ (resp., $\mathcal{L}$) to consist of all processing that is done in the high (resp.,
low) environment, including any communication mechanism that delivers messages to E, we
will not need to model messages in transit or, in Halpern and Tuttle's terminology, the state
of the environment; rather, these components of the global state will be included as part of
$\mathcal{H}$'s and $\mathcal{L}$'s state.

In many systems of interest, the timing of events is of concern. (See Lampson's [25] for an
early description of covert communication channels that depend on timing.) In particular,
some covert communication channels depend on a clock being shared between the covert
senders and receivers. Such channels are typically called timing channels; see Wray's [43]
for  examples  and  discussion.  To  handle  such  cases,  we  take  the  set  of  times  (i.e.,  the  domain 
of  the  runs)  to  be  the  ticks  of  the  best  available  shared  clock.2  Events  occurring  between 
two  ticks  are  regarded  as  occurring  on  the  latter  tick.  This  is  sufficient  for  the  purposes  of 
our  analysis  because,  as  far  as  the  covert  senders  and  receivers  are  concerned,  this  is  the 
most accurate information available. Also note that if the timing of certain events (wrt the
best available shared clock) is nonprobabilistic, we can consider the various possibilities to
be choices that are made by the adversary and factor out that nondeterminism as discussed
by Halpern and Tuttle [22].

2 A shared clock may be an explicit clock supplied by E, e.g., the system clock, or it may be a clock
manufactured by the covert senders and receivers for their own purposes; see [43] for examples and discussion.

Since  the  mechanisms  of  high-level3  I/O  routines  may  introduce  covert  channels  (see,  e.g., 
McCullough’s  [28,  §2.3]),  we  take  a  very  low-level  view  of  I/O.  In  particular,  we  assume 
one  input  and  one  output  per  channel  per  unit  time  (where  times  are  chosen  according  to 
the  above  considerations).  That  is,  for  each  time  we  have  a  vector  of  inputs  (one  for  each 
channel)  and  a  vector  of  outputs  (one  for  each  channel).  If  a  given  agent  produces  no  new 
data  value  at  a  given  time,  it  may  in  fact  serve  as  a  signal  in  a  covert  channel  exploitation. 
Hence,  we  treat  such  “no  new  signal”  events  as  inputs.  Similarly,  we  do  not  consider  the 
possibility  that  the  system  can  prevent  an  input  from  occurring.  Rather,  the  system  merely 
chooses  whether  to  make  use  of  the  input  or  ignore  it.  Any  acknowledgement  that  an  input 
has  been  received  is  considered  to  be  an  output. 

Given  these  considerations,  we  fix  our  model  of  communication  as  follows.  We  assume  the 
following  basic  sets  of  symbols,  all  nonempty: 

C: a finite set of input/output channel names, $c_1, \ldots, c_k$,

I: representing the set of input values,

O: representing the set of output values,

$\mathbb{N}^+$: representing the set of positive natural numbers. This set will be used as our set of
“times”.

Since  there  is  one  input  per  channel  at  each  time,  we  will  be  talking  about  the  vector  of 
inputs that occurs at a given time. We will denote the set of all vectors of inputs by $I[C]$.
Typical input vectors will be denoted $a, a', a_1, \ldots \in I[C]$.

Similarly, we will denote the set of all output vectors by $O[C]$ and typical output vectors
will be denoted $b, b', b_1, \ldots \in O[C]$.

Now,  to  talk  about  the  history  of  input  vectors  up  to  a  given  time,  we  introduce  notation 
for traces. We will denote the set of input traces of length $k$ by $I_{C,k}$. Mathematically, $I_{C,k}$
is a shorthand for the set of functions from $C \times \{1, 2, \ldots, k\}$ to $I$. Therefore, for a trace
$\alpha \in I_{C,k}$, we will denote the single input on channel $c \in C$ at time $k' \le k$ by $\alpha(c, k')$.

3 In this context, “high-level” means highly abstract rather than highly classified.

We will also need to talk about infinite traces of inputs. For this we use the analogous
notation $I_{C,\infty}$, which is shorthand for the set of functions from $C \times \mathbb{N}^+$ to $I$.

Similarly, we will denote the set of output traces of length $k$ by $O_{C,k}$ and the set of infinite
output traces by $O_{C,\infty}$. Naturally, for an output trace $\beta$, $\beta(c, k)$ represents the output on
channel $c$ at time $k$.

There  will  be  situations  where  we  want  to  talk  about  vectors  or  traces  of  inputs  or  outputs 
on some subset of the channels, $S \subseteq C$. In such cases we will use the natural generalizations
of the above notations, viz., $I[S]$, $I_{S,\infty}$, etc.

ENVIRONMENTAL CONSTRAINTS Any given agent will be able to see the inputs
and outputs on a subset of the system's I/O channels. We make this precise by “restricting”
vectors and traces to subsets of C. Given an input vector $a \in I[C]$ and a set of channels
$S \subseteq C$, we define $a \upharpoonright S \in I[S]$ to be the input vector on channels in $S$ such that
$(a \upharpoonright S)(c) = a(c)$ for all $c \in S$.

Similarly, given an input trace $\alpha \in I_{C,k}$ and a set of channels $S \subseteq C$, we define $\alpha \upharpoonright S \in I_{S,k}$
to be the input trace for channels in $S$ such that $(\alpha \upharpoonright S)(c, k') = \alpha(c, k')$ for all $c \in S$ and all
$k' \le k$.
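
The restriction operation on vectors and traces can be sketched in Python as follows. The representation is an assumption made for illustration: a vector is a dictionary from channel names to values, a trace is a dictionary keyed by (channel, time) pairs, and the names `restrict_vector`, `restrict_trace`, and the channel names are hypothetical.

```python
def restrict_vector(a, S):
    """Restrict vector a to the channels in S (agrees with a on S)."""
    return {c: v for c, v in a.items() if c in S}


def restrict_trace(alpha, S):
    """Restrict trace alpha to S: keep the entries (c, k') with c in S."""
    return {(c, t): v for (c, t), v in alpha.items() if c in S}


# Hypothetical example: three channels, of which only "c1" is low.
C = {"c1", "c2", "c3"}
L = {"c1"}

# An input vector in I[C].
a = {"c1": 0, "c2": 1, "c3": 1}

# An input trace of length 2: channel "c1" carries the time index,
# the other channels carry the constant 9.
alpha = {(c, t): (t if c == "c1" else 9) for c in C for t in (1, 2)}

low_vector = restrict_vector(a, L)
low_trace = restrict_trace(alpha, L)
```

Restricting to the full channel set C returns the original vector or trace unchanged, which matches the definition above.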

We assume that the set of low channels, denoted $L$, is a subset of $C$. Intuitively, $L$ is the
set of channels that the low environment, $\mathcal{L}$, is able to directly see. In particular, $\mathcal{L}$ is able
to see both the inputs and the outputs that occur on channels in $L$.

In practice, there will be some type of physical or procedural constraints on the agent $\mathcal{L}$ to
prevent it from directly viewing the inputs and outputs on channels in $C - L$. For example,
those channels may represent wires connected to workstations that are used for processing
secret data. In this case, the secret workstations might be located inside a locked and guarded
room. In addition, periodic checks of the wires might be made to ensure that there are no
wiretaps on them. In this way, $\mathcal{L}$ is prevented from directly viewing the data that passes
over the channels in $C - L$.

On the other hand, we place no constraints on the set of channels that $\mathcal{H}$ is able to see. In
particular, we make the worst-case assumption that $\mathcal{H}$ is able to see all inputs and outputs
on  all  channels. 

The  above  considerations  are  consistent  with  what  we’ve  called  the  “Secure  Environment 
Assumption”  in  previous  work  [17,  18].  In  the  present  paper,  this  assumption  is  made  precise 
in terms of our definition of the adversary to be given next.

THE ADVERSARY As discussed above, in Halpern and Tuttle's framework, all nonprobabilistic
choices are factored out of the execution of the system by fixing an adversary at
the  start  of  execution.  To  make  use  of  this  framework,  we  must  define  the  set  of  possible 
adversaries  from  which  this  choice  is  made. 

The “adversary” in our application is the pair of agents, $\mathcal{H}$ and $\mathcal{L}$, that are attempting to
send  data  from  the  high  environment  across  the  system  E  to  the  low  environment.  To  be 
fully  general,  we  model  these  agents  as  mixed  strategies  (in  the  game-theoretic  sense).  That 
is,  at  each  point  in  the  execution  of  the  system  the  strategy  gives  the  probability  distribution 
over  the  set  of  next  possible  inputs,  conditioned  on  the  history  up  to  the  current  point.  In 
the  next  section,  we  present  an  example  to  motivate  the  need  for  such  generality.  Before 
doing  that,  we  make  the  adversary  precise  with  the  following  two  definitions. 

Definition 2.1 An adversary is a conditional probability function, $A(a \mid \alpha, \beta)$ (where $a \in I[C]$
and, for some time $k$, $\alpha \in I_{C,k}$ and $\beta \in O_{C,k}$). Intuitively, the adversary describes the
environment's conditional distribution on the next input vector, given the previous history
of inputs and outputs. By saying that $A(a \mid \alpha, \beta)$ is a conditional probability function we
require that

• $0 \le A(a \mid \alpha, \beta) \le 1$, and

• $\sum_{a} A(a \mid \alpha, \beta) = 1$.

In fact, it is trivial to define a conditional probability mass function corresponding to $A$ where
$a$, $\alpha$, and $\beta$ are replaced with the values of the corresponding random variables [33]. Such a
conditional probability mass function can be defined in terms of the probability measure $\mu_A$
given in Definition 4.4 below. □

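Definition 2.2 We say that an adversary $A$ satisfies the Secure Environment Assumption
with respect to a set of channels $L \subseteq C$ iff there exists a pair of conditional probability
functions $\mathcal{H}$ and $\mathcal{L}$ such that for all $a \in I[C]$, all $k \in \mathbb{N}^+$, all $\alpha \in I_{C,k}$, and all $\beta \in O_{C,k}$,

$A(a \mid \alpha, \beta) = \mathcal{H}(a \upharpoonright (C - L) \mid \alpha, \beta) \cdot \mathcal{L}(a \upharpoonright L \mid \alpha \upharpoonright L, \beta \upharpoonright L)$

(where $\cdot$ denotes real multiplication). □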
The Secure Environment Assumption can be intuitively understood as saying that the input
on channels in $(C - L)$ at time $k$ is (conditionally) statistically independent of the input on
channels in $L$ at time $k$, and the input on channels in $L$ at time $k$ depends only on previous
inputs  and  outputs  on  channels  in  L.  For  the  remainder  of  this  paper,  we  will  assume 
that  all  adversaries  from  which  the  initial  choice  is  made  satisfy  the  Secure  Environment 
Assumption. 
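
To illustrate the factorization required by the Secure Environment Assumption, the following Python sketch builds an adversary as the product of a high part and a low part. Everything concrete here is an assumption chosen for the example: the two channel names, the binary input set, the particular strategies `high_part` and `low_part`, and the dictionary representation of vectors and traces.

```python
from fractions import Fraction
from itertools import product

C = ("h", "l")                                 # all channels (hypothetical)
LOW = ("l",)                                   # the low channels L
HIGH = tuple(c for c in C if c not in LOW)     # the channels C - L
I = (0, 1)                                     # input values


def restrict_vector(a, S):
    return {c: v for c, v in a.items() if c in S}


def restrict_trace(trace, S):
    return {(c, t): v for (c, t), v in trace.items() if c in S}


def last_value(trace, channel):
    """Most recent value on a channel, or 0 if the channel is absent."""
    times = [t for (c, t) in trace if c == channel]
    return trace[(channel, max(times))] if times else 0


def high_part(a_high, alpha, beta):
    """High strategy: may depend on the full history on all channels."""
    target = last_value(beta, "h")
    return Fraction(3, 4) if a_high["h"] == target else Fraction(1, 4)


def low_part(a_low, alpha_low, beta_low):
    """Low strategy: receives only the history restricted to low channels."""
    target = last_value(beta_low, "l")
    return Fraction(2, 3) if a_low["l"] != target else Fraction(1, 3)


def adversary(a, alpha, beta):
    """A(a | alpha, beta) as the product of the high and low parts."""
    return (high_part(restrict_vector(a, HIGH), alpha, beta)
            * low_part(restrict_vector(a, LOW),
                       restrict_trace(alpha, LOW),
                       restrict_trace(beta, LOW)))


all_vectors = [dict(zip(C, vals)) for vals in product(I, repeat=len(C))]

# Two length-1 histories that differ only on the high channel.
alpha = {("h", 1): 1, ("l", 1): 0}
beta = {("h", 1): 1, ("l", 1): 0}
beta_other = {("h", 1): 0, ("l", 1): 0}


def low_marginal(value, alpha, beta):
    """Probability that the next low input equals value."""
    return sum(adversary(a, alpha, beta) for a in all_vectors if a["l"] == value)
```

Because the low part is only ever handed the restricted history, the marginal distribution of the next low input is the same for `beta` and `beta_other`, which is the independence property described above.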

Later  in  this  section,  we  describe  how  a  given  adversary  A  and  the  description  of  a  particular 
system, E, are used to construct the corresponding computation tree $T_A$. Since there is one
tree for each possible adversary, we can think of the set of trees as being indexed by the
adversaries. Therefore, we will often write $T_A$, $T_{A_1}$, $T_{A'}$, etc.

It is clear that for an adversary $A$ that satisfies the Secure Environment Assumption (wrt
$L$), the conditional probability functions $\mathcal{H}$ and $\mathcal{L}$ are unique. Further, given $\mathcal{H}$ and $\mathcal{L}$, there
is a unique adversary, $A$, for which $\mathcal{H}$ and $\mathcal{L}$ are the probability functions that satisfy the
corresponding constraint. There is therefore no ambiguity in writing $T_{\mathcal{H},\mathcal{L}}$, $T_{\mathcal{H}',\mathcal{L}}$, etc. when
we want to refer to the components of the adversary individually.

Note  that  our  definition  of  an  adversary  is  not  meant  to  be  as  general  as  the  adversary 
discussed  by  Halpern  and  Tuttle.  (In  fact,  Halpern  and  Tuttle  give  no  structure  at  all 
to  their  adversary.)  Rather,  our  adversary  is  application-specific;  in  particular,  it  is  for 
reasoning  about  multilevel  security  of  probabilistic  systems  and  is  not  designed  to  be  used 
outside  that  domain. 

On  the  other  hand,  this  particular  adversary  represents  a  novel  application  of  Halpern  and 
Tuttle’s  framework.  In  Halpern  and  Tuttle’s  examples,  the  adversary  represents  one  or  both 
of  two  things: 

•  the  initial  input  to  the  system;  and 

•  the  schedule  according  to  which  certain  events  (e.g.,  processors  taking  steps)  occur. 

In contrast, our adversary does not represent a given input to the system. Rather, it represents
a mixed strategy for choosing the inputs to the system. In some sense, we can think
of  this  as  a  generalization  on  the  first  item  above;  however,  our  application  still  fits  within 
the  framework  set  out  by  Halpern  and  Tuttle. 
THE STATE OF THE SYSTEM At any given point, P, in any given computation tree,
$T_A$, there should be a well-defined state of the system. For our purposes, the state includes
the  following  information. 


1.  All  inputs  and  outputs  that  have  occurred  on  all  channels  up  to  the  current  time. 

2.  The  adversary.  In  [22],  Halpern  and  Tuttle  make  the  assumption  that  all  points  in 
all  trees  are  unique.  They  suggest  (and  we  adopt)  the  following  idea  to  ensure  that 
this is true. The state encodes the adversary. That is, all nodes in tree T_A encode A.
Note that we do not assume that any given agent knows the adversary; just that it is
somehow encoded in the state. We can think of the high part of the adversary, ℋ, as
being encoded in the high environment and the low part, ℒ, as being encoded in the
low environment.

3. Additional components of the global state represent the internal state of Σ. For example,
in describing Σ, it is often convenient to use internal state variables. The state of
these variables can be thought of as a vector of values, one value for each state variable.
Thus, the internal state, when it exists, will be denoted c, and the history of internal
states will be denoted γ.

COMPUTATION  TREES  Now  that  we  have  set  out  the  possible  states  of  the  system 
(i.e., the points of computations), we can talk about the construction of the computation
trees. 

For each reachable point, P, we assume that Σ’s probability distribution on outputs is given.
For example, this can be given by a conditional probability distribution, O(b, c | α, β, γ),
where α, β, and γ give the history (up through some time k) of inputs, outputs, and internal
states, respectively, c is a vector of internal state variables (i.e., the internal system state at
time k + 1), and b is the vector of outputs produced by the system (at time k + 1).

Given O(b, c | α, β, γ) and the adversary A, we can construct the corresponding computation
tree  by  starting  with  the  initial  state  of  the  system  (i.e.,  the  point  at  the  root  of  the  tree 
with  empty  histories  of  inputs,  outputs,  etc.)  and  iteratively  extending  points  as  follows. 

Let P be a point in the tree with internal system history γ, input history α, and output
history β. We will make P' a child of P iff

1. P' is formed from P by modifying the internal system state to c and extending P’s
input history (output history, resp.) with a (b, resp.); and

2. both O(b, c | α, β, γ) and A(a | α, β) are positive.

In such cases, we label the arc from P to P' with O(b, c | α, β, γ) · A(a | α, β), i.e., the
system, Σ, and the environment, A, make their choices independently.
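
The child-construction step can be sketched in code. The following is a minimal illustration only, assuming a toy system with a single binary input and a single binary output per step and no internal state. The names `system_dist`, `adversary_dist`, and `children`, and the particular distributions chosen, are hypothetical and do not come from the paper.

```python
from fractions import Fraction
from itertools import product

# A point is a pair (input_history, output_history) of tuples.
# Internal state is omitted for brevity (an assumption of this sketch).

def system_dist(b, alpha, beta):
    """Hypothetical O(b | alpha, beta): the system outputs a fair random bit."""
    return Fraction(1, 2)

def adversary_dist(a, alpha, beta):
    """Hypothetical A(a | alpha, beta): echo the last output, or 0 at the root."""
    expected = beta[-1] if beta else 0
    return Fraction(1) if a == expected else Fraction(0)

def children(point, inputs=(0, 1), outputs=(0, 1)):
    """Return {child: arc_label} for every child with positive probability."""
    alpha, beta = point
    result = {}
    for a, b in product(inputs, outputs):
        p_sys = system_dist(b, alpha, beta)
        p_adv = adversary_dist(a, alpha, beta)
        if p_sys > 0 and p_adv > 0:
            # System and environment choose independently, so multiply.
            result[(alpha + (a,), beta + (b,))] = p_sys * p_adv
    return result

root = ((), ())
kids = children(root)
```

Under these assumptions the arc labels leaving any point sum to one, which is what lets the labels along a path be multiplied into a probability later on.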

RUNS OF THE SYSTEM A run of the system is an infinite sequence of states along a
path in one of the computation trees. When we want to talk about the particular run, ρ,
and time, k, at which a point P occurs, we will denote the point by the pair (ρ, k). Further,
if we wish to talk about the various components of the run, i.e., the trace of the inputs, α,
outputs, β, or other variables, γ, we will denote the run by (α, β, γ) and denote the point,
P, by (α, β, γ, k).

For a given tree, T, we denote the set of runs (i.e., infinite sequences of states), formed by
tracing a path from the root, by runs(T).

For security applications we are concerned with information flow into and out of the system
rather than with information in the system per se. Thus, though our system model is
adequate to represent internal states and traces thereof, in subsequent sections it will
suffice to represent systems entirely in terms of input and output. In particular, system
behavior can be represented by ‘O(b | α, β)’ rather than ‘O(b, c | α, β, γ)’.

3  Syntax 

In  this  section  we  set  out  our  formal  language  and  use  it  to  describe  two  simple  systems. 
Then  we  give  the  axioms  and  rules  of  our  logic. 

3.1  Formation  Rules 

To describe the operation of the system under consideration (viz., Σ), we use a variant of
Lamport’s Raw Temporal Logic of Actions (RTLA) [24].⁴ The primary difference is that we
add a modal operator Pr_i(φ) that allows us to specify and reason about the probabilistic
behavior of the system.

⁴ Roughly speaking, Raw Temporal Logic of Actions (RTLA) is the same as Lamport’s Temporal Logic
of Actions (TLA) without the treatment of stuttering [24]. Since we are not, in this paper, concerned with
refinement, we omit the considerations of stuttering and use RTLA.

From the previous section, we assume the following basic sets of symbols, all nonempty: C,
I, O, and ℝ. Members of ℝ will have the usual representation, e.g., 43.5 ∈ ℝ.

We will also be talking about the subjects (or agents) of the system. Formally, a subject,
S ⊆ C, is identified with the process’s view of the system, i.e., the set of channels on which
it can see the inputs and outputs.

Formulae  in  the  language  are  built  up  according  to  the  following  rules. 

•  constants  from  the  set  of  basic  symbols  are  terms. 

•  state  variables  (representing  the  value  of  that  variable  in  the  current  state)  are  terms. 
Among  the  state  variables,  there  are  two  reserved  for  each  communication  channel. 
For each c ∈ C, we have a state variable c_in that takes values from I, and another
state variable c_out that takes values from O. Note that, implicitly, inputs are from the
covert senders and receivers into the system (Σ) and outputs are from the system to the
covert senders and receivers. This is because Σ is the system under consideration (i.e.,
with  respect  to  which  we  are  reasoning  about  security).  We  have  no  mechanism  (and 
no  need)  to  specify  communication  between  agents  not  including  the  system  under 
consideration. 

• primed state variables (e.g., c'_in) are terms. (These represent the value of the variable
in  the  next  state.) 

•  We  use  standard  operators  among  terms  (e.g.,  +  and  •  for  addition  and  multiplication, 
respectively),  with  parentheses  for  grouping  subterms,  to  form  composite  terms. 

•  an  atomic  formula  is  an  equation  or  inequality  among  terms. 

• For any formula φ, □φ is a formula (to be read intuitively as always φ).

• We build up composite formulae, in the usual recursive fashion using ∧, ∨, and ¬.

• for any nonmodal formula⁵ φ, and for any subject S ⊆ C, Pr_S(φ) is a real-valued term.
Intuitively, Pr_S(φ) represents the subjective probability that S assigns to the formula
φ, that is, the probability of φ given the previous history of inputs and outputs on
channels in S. We refer to Pr_C(φ) (where C is the set of all communication channels) as
the objective probability of φ, since it represents the probability of φ given all available
information, i.e., the unbiased probability of φ.

⁵ A nonmodal formula is a formula that does not contain any knowledge or temporal operators.

To specify and reason about our security properties of interest, we also add a set of modal
operators on formulae: K_1, ..., K_n, representing knowledge for each subject (represented by
the subscript of the operator). Therefore, we add the following additional formation rule to
our syntax.

• For any formula φ, and for any subject S ⊆ C, K_S(φ) (representing that S knows φ)
is a formula.

Note  that  this  and  previous  rules  are  mutually  recursive;  so,  we  can  express,  e.g.,  that  S 
always  knows  that  x  =  5. 

3.2  Examples 

We  now  give  two  simple  examples  of  how  to  describe  systems  in  our  language.  Ultimately, 
we  will  have  sufficient  formal  machinery  to  show  that  one  of  these  systems  is  secure  and  the 
other  is  not:  however,  here  we  simply  set  them  out  formally.  These  descriptions  are  meant 
to  give  the  reader  an  intuitive  feel  for  the  meaning  of  expressions  in  the  language.  Precise 
meanings  will  be  given  in  §4.  Also,  the  second  of  these  examples  will  motivate  our  choice  to 
model  adversaries  as  strategies. 

Example  3.1  The  first  example  is  a  simple  encryption  box  that  uses  a  “one-time  pad”  [10]. 
It  has  two  channels,  high  and  low.  At  each  tick  of  the  system  clock,  it  inputs  a  0  or  1  on 
the  high  channel  and  outputs  a  0  or  1  on  the  low  channel.  The  low  output  is  computed  by 
taking the “exclusive or” (XOR) (denoted ⊕) of the high input and a randomly generated
bit. 

Note  that  we  are  modeling  only  the  sender’s  (encrypting)  side  of  a  one-time  pad  system. 
Thus,  issues  such  as  how  the  random  bit  string  is  distributed  to,  and  used  by,  the  receiver’s 
(decrypting)  side  are  out  of  the  scope  of  this  specification. 

It is well known that the XOR of a data stream with an independent uniformly distributed
bit stream results in an output stream that is uniformly distributed. Therefore, we can
describe the encryption box as follows.

Let C = {h, l}, I = {0, 1}, and O = {0, 1}. Then, the system is specified by the following
formula.

□(Pr_C(l'_out = 0) = 0.5 ∧ Pr_C(l'_out = 1) = 0.5)

In this formula, l_out is a state variable representing the output on the low channel, l. Therefore,
l'_out is the output on l at the next time. Further, Pr_C(l'_out = 0) denotes the probability
that the output on l is a 0 at the next time. Hence, the entire formula says that at all times,
the probability of Σ producing a one (1) on the next clock tick is equal to the probability
of producing a zero (0), which is equal to 0.5. Note that we have not specified the probability
distribution over inputs, since this constitutes environment behavior rather than system
behavior.

□ 
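
The uniformity claim behind Example 3.1 can be checked by direct calculation. The sketch below is an illustration only: it assumes the pad bit is fair and independent of the high input, and the function name `low_output_dist` and its parameter are hypothetical, not part of the paper's formalism.

```python
from fractions import Fraction

def low_output_dist(p_high_one):
    """Exact distribution of l_out = h_in XOR r.

    p_high_one is the (arbitrary) probability that the high input is 1;
    r is a fair bit assumed independent of the high input.
    """
    half = Fraction(1, 2)
    p_high = {0: 1 - p_high_one, 1: p_high_one}
    dist = {0: Fraction(0), 1: Fraction(0)}
    for h, ph in p_high.items():
        for r in (0, 1):
            # Each (h, r) pair contributes its joint probability to h XOR r.
            dist[h ^ r] += ph * half
    return dist

# The high-input bias does not affect the result.
biased = low_output_dist(Fraction(1, 3))
```

Whatever value is passed for the high-input bias, both low outputs receive probability 1/2, which agrees with the specification formula leaving the input distribution unspecified.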


Example  3.2  The  second  example  is  an  insecure  version  of  the  simple  encryption  box. 
Shannon  [37]  gives  an  early  description  of  this  system. 

As  in  the  first  example,  the  system  computes  the  “exclusive  or”  of  the  current  high  input  and 
a  randomly  generated  bit  and  outputs  that  value  on  the  low  channel  at  each  time.  However, 
in  this  system,  the  randomly  generated  bit  used  at  any  given  tick  is  actually  generated  and 
output  on  the  high  output  channel  during  the  previous  tick  of  the  clock. 

This can be expressed in our formalism as follows. Let C = {h, l}, I = {0, 1}, and O = {0, 1}.
The following formula specifies the system.

□(Pr_C(h'_out = 0) = 0.5 ∧ Pr_C(h'_out = 1) = 0.5 ∧ l'_out = h_out ⊕ h'_in)

Note that in the third conjunct, h_out is unprimed, indicating that the output on l at the next
time is the “exclusive or” of the current output on h with the next input on h.

Now  note  that  if  the  high  agent  ignores  its  output,  this  system  acts  exactly  as  the  system 
from  the  previous  example  (and  can  be  used  for  perfect  encryption).  In  particular,  suppose 
we were to model an adversary as an input string (the input to be provided by the high
agent). Then, it is straightforward to prove that for any adversary (i.e., any high input string)
fixed  prior  to  the  start  of  execution,  the  output  to  low  will  be  uniformly  distributed  and,  in 
fact,  will  contain  no  information  about  the  high  input  string. 

However, the bit that will be used as the one-time pad at time k is available to the high agent
at time k − 1. Therefore (due to the algebraic properties of “exclusive or”, viz., x ⊕ x ⊕ y = y),
the high agent can use this information to counteract the encryption. In particular, the high
agent can employ a (game-theoretic) strategy to send any information it desires across the
system to the low agent.

For example, suppose the high agent wishes to send a sequence of bits, the k-th of which
we denote b_k. We’ll denote the high input (resp., output) at time k by h_in(k) (resp., h_out(k)).
The appropriate strategy for the high agent is as follows.

The high agent chooses its input for time k + 1 as h_in(k + 1) = h_out(k) ⊕ b_k.

Thus, the output to low at time k + 1, denoted l_out(k + 1), is computed as follows.

l_out(k + 1) = h_out(k) ⊕ h_in(k + 1)        [by the system description]

             = h_out(k) ⊕ h_out(k) ⊕ b_k     [by the high strategy]

             = b_k                           [by the properties of ⊕]

Thus, by employing the correct strategy, the high agent can noiselessly transmit an arbitrary
message over Σ to the low agent. This, of course, motivates our choice to model adversaries
as strategies, rather than, e.g., input strings.

□ 
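
The covert channel of Example 3.2 can be exercised with a short simulation. This is a sketch under stated assumptions: the pad bits are drawn from Python's `random.Random`, and the function name `run_insecure_box` and its `seed` parameter are hypothetical names introduced for illustration.

```python
import random

def run_insecure_box(message, seed=0):
    """Simulate the insecure box while the high agent plays the XOR strategy.

    Returns the sequence of bits observed on the low output channel.
    """
    rng = random.Random(seed)
    h_out = rng.randint(0, 1)      # pad bit revealed on the high output at time k
    received = []
    for b in message:
        h_in = h_out ^ b           # high strategy: h_in(k+1) = h_out(k) XOR b_k
        l_out = h_out ^ h_in       # system:        l_out(k+1) = h_out(k) XOR h_in(k+1)
        received.append(l_out)
        h_out = rng.randint(0, 1)  # fresh pad bit, visible to high before the next tick
    return received

secret = [1, 0, 1, 1, 0, 0, 1]
leaked = run_insecure_box(secret)
```

Because the pad cancels algebraically, the low output equals the message for every choice of pad bits, so the result does not depend on the seed.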


We now have some sense of the formal language, with the exception of the knowledge operator
K_S. As previously mentioned, this operator will be used to formalize the security properties
that interest us. We will illustrate that use in a later section. For now we mention that
in security analyses it is typical to assume that system users (and penetrators) know how
the system works (i.e., its specifications are not secret); we make such assumptions explicit
using our knowledge operator. In particular, if the system specification is given by a formula
φ, we will assume that for every subject S, K_S(φ).

3.3 The Logic

We now give the axioms of our logic. In the following, we will use ‘φ’ and ‘ψ’ to refer to
formulae of our language.

Propositional  Reasoning  All  instances  of  tautologies  of  propositional  logic. 

Temporal Reasoning The following are standard axioms for temporal reasoning about
discrete systems. The logic they constitute is generally called S4.3Dum. (See Goldblatt’s
[16] for details.) We have labelled the axioms with their historical names. Let
φ and ψ be formulae of our language.

K (□φ ∧ □(φ → ψ)) → □ψ

T □φ → φ

4 □φ → □□φ

L □(φ ∧ □φ → ψ) ∨ □(ψ ∧ □ψ → φ)

Dum □(□(φ → □φ) → φ) → (◇□φ → φ)

‘◇φ’ can be interpreted roughly as saying that at some point φ is true. Formally, it
is viewed as notational shorthand: for all formulae φ, ◇φ = ¬□¬φ. K guarantees
that the temporal operator respects modus ponens. Each of the other axioms captures
a feature of time that we desire. 4 gets us transitivity. T guarantees that we don’t
run out of time points (seriality) and that temporal references include the present. L
guarantees that all points in time are connected (linearity). And, Dum guarantees
that time is discrete. (Between any two points in time there are at most finitely many
other points; see Goldblatt’s [16] for further discussion.)

Real Number Axioms Standard field and order axioms for the real numbers (to apply to
members of ℝ and function terms with range ℝ). We will not enumerate these axioms.
(See any elementary real analysis book for enumeration, e.g., [27] or [34].)

Epistemic Reasoning The (nonredundant) axioms of the Lewis system S5 (cf. Chellas
[7], or Goldblatt [16]) apply to the knowledge operators (K_S). As for temporal axioms,
we give the axioms their historical names. Let S be a subject, and let φ and ψ be
formulae of our language.

K (K_S(φ) ∧ K_S(φ → ψ)) → K_S(ψ) (Knowledge respects modus ponens.)

T K_S(φ) → φ (What one knows is true.)

5 ¬K_S(φ) → K_S(¬K_S(φ)) (If you don’t know something, then you know that you
don’t know it.)

Random Variable Axioms The standard requirements for random variables (in the probability
theoretic sense).

PM (Positive Measure) for any formula, φ, and any subject, S, Pr_S(φ) ≥ 0
(The probability of any event is greater than or equal to zero.)

NM (Normalized Measure) for any channel, c, and any subject, S,

∑_{a∈I} Pr_S(c_in = a) = 1 (The probability of all possibilities sums to one.)

∑_{b∈O} Pr_S(c_out = b) = 1

Additional Axioms Since our logic contains three different modalities, we need some axioms
to describe the interactions among them. The following are not intended to be
complete in any sense; they are merely sufficient for the present purposes.

K□ For any formula φ and any subject S,

K_S(□φ) → □(K_S(φ))

(If S knows something is always true, then S always knows it’s true.)

KPr For any formula φ, any subject S, and any real number r,

K_S(Pr_C(φ) = r) → K_S(Pr_S(φ) = r)

(If S knows the objective probability of φ, then S knows its subjective probability
of φ and the two probabilities are the same.)

The  above  are  all  of  our  axioms.  We  now  give  the  rules  of  our  logic,  which  are  both  standard. 

MP (Modus Ponens)

From φ and φ → ψ infer ψ.

Nec (Necessitation) This rule applies to both of our modal operators: □ and K_S. (It is
called ‘necessitation’ because it was originally applied to a necessity operator.)

From ⊢ φ infer ⊢ □φ

From ⊢ φ infer ⊢ K_S(φ)

Note that in the above, ‘⊢ φ’ indicates a derivation of φ from the axioms alone, rather than
from a set of premises. (Derivations are defined below.) Thus, in the case of the knowledge
operator (and analogously for □) Nec says that if φ is a theorem (derivable without any
premises) then all subjects know φ.

We  can  now  define  formal  derivations. 

Definition 3.3 Let Γ be a finite set of formulae of our language. A finite sequence of
formulae φ_1, φ_2, φ_3, ..., φ_n is called a derivation (of φ_n from Γ) iff each φ_k (k = 1, ..., n)
satisfies one of the following:

• φ_k ∈ Γ.

• φ_k is an axiom.

• φ_k follows from some theorem by Nec.

• For some i, j < k, φ_k results from φ_i and φ_j by MP.

We write ‘Γ ⊢ φ’ to indicate a derivation of φ from Γ, and we write ‘⊢ φ’ to indicate a
derivation of φ from the axioms alone. □

This  completes  our  statement  of  the  formal  system. 

4  Semantics 

In  the  previous  section  we  presented  a  syntactic  system.  So  far  we  have  only  intuitive 
meanings  to  attach  to  this  formalism.  In  this  section  we  provide  semantics  for  our  system 
in  terms  of  the  Halpern-Tuttle  framework  and  our  application-specific  model  set  out  in  §2. 

4.1  Semantic  Model 

A model M is a tuple of the form:

(ℝ, +, ·, ≤, W, T, C, I, O, v, κ_1, ..., κ_{|𝒫(C)|})

Here, ℝ and its operations and ordering relation give us the real numbers; W is the set of
points (i.e., global states or “worlds”); T is the set of labeled computation trees (with nodes
from W); C, I, and O are the sets of channels, possible inputs, and possible outputs, respectively;
v is the assignment function, which assigns semantic values to syntactic expressions
at each point (values of v at a particular point P will be indicated by the projection ‘v_P’);
and the κ_i are knowledge accessibility relations, one for each subject S. Essentially,
two points are accessible for a given subject if that subject cannot distinguish between those
two points (i.e., the subject does not “know” which of the points he is in). We describe
these accessibility relations precisely in the next section. In the remainder of this paper we
will generally denote the accessibility relation corresponding to subject S by ‘κ_S’.

In assigning meaning to our language, it is of fundamental importance to associate a probability
space with each labeled computation tree. In particular, for each labeled computation
tree T_A we will construct a sample space of runs, ℛ_A, an event space, X_A (i.e., those subsets
of ℛ_A to which a probability can be assigned), and a probability measure μ_A that assigns
probabilities to members of X_A.

Our  construction  of  this  probability  space  is  quite  natural  and  standard  (see,  e.g.,  Seidel’s 
[36]  as  well  as  [22]  for  two  instances).  We  will  not  go  into  detail  explaining  the  basic  concepts 
of  probability  and  measure  theory  here  (cf.  [20]  or  [39]). 

Definition 4.1 For a labeled computation tree T_A, the associated sample space ℛ_A is the
set of all infinite paths starting from the root of T_A. □

Definition 4.2 For any sample space ℛ_A, the set e ⊆ ℛ_A is called a generator iff it
consists of the set of all traces with some common finite prefix. Intuitively, generators are
probability-theoretic events corresponding to finite traces. □

Definition 4.3 For any sample space ℛ_A, we define the event space, X_A, to be the
(unique) field of sets generated by the set of all generators. That is, X_A is the smallest
subset of 𝒫(ℛ_A) that contains all of the generators and is closed under countable union and
complementation. □

Definition 4.4 We define the probability measure, μ_A, on X_A in the standard way.
Suppose e is a generator corresponding to the finite prefix given by (ρ, k). Then, μ_A(e) is
defined as the product of the transition probabilities from the root of the tree, along the
path ρ, up to time k. Further, it is well known that there is a unique extension of μ_A to the
entire event space (cf. [20]). □
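
The value that Definition 4.4 assigns to a generator can be illustrated with a small calculation. The sketch below assumes a finite prefix is represented simply as the list of arc labels along it; the name `generator_measure` and the example labels are hypothetical.

```python
from fractions import Fraction
from math import prod

def generator_measure(arc_labels):
    """Measure of the generator whose common prefix follows these arcs from the root.

    The empty prefix corresponds to the whole sample space and has measure 1.
    """
    return prod(arc_labels, start=Fraction(1))

# A prefix of length two, and its two one-step extensions whose
# arc labels (1/3 and 2/3) sum to one.
prefix = [Fraction(1, 2), Fraction(1, 4)]
extensions = [prefix + [Fraction(1, 3)], prefix + [Fraction(2, 3)]]
```

In this example the measures of the two extensions add up to the measure of the prefix, as one would expect when the arc labels leaving each node sum to one.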

We will be rather abusive in the use of our probability measures μ_A. In particular, when
we have a finite set of points, x, we will write μ_A(x) to denote the probability (as assigned
by μ_A) of passing through one of the points in x. Technically, this is wrong, since μ_A is
defined for sets of runs, not for sets of points. However, the mapping between the two is
extremely natural; the set of runs corresponding to a set of points is the set of all runs that
pass through those points. Further, by the construction of our probability spaces, all sets of
runs corresponding to finite sets of points are measurable. Therefore, there is no danger in
this abuse of notation and it greatly simplifies our presentation.

4.2  Assignment  Function 

Given  the  above  semantic  model,  the  main  technical  question  we  need  to  address  in  assigning 
meaning  to  formulae  in  our  logic  is: 

For a given subject at a given point in its execution (i.e., at a given node in a
given  computation  tree),  what  sample  space  should  be  used  in  evaluating  the 
probability  that  subject  assigns  to  a  given  formula? 

As  discussed  by  Halpern  and  Tuttle,  after  choosing  these  sample  spaces,  assigning  meaning 
to  probability  formulae  is  straightforward.  Further,  assigning  meanings  to  nonprobability 
formulae  will  be  done  in  the  standard  ways,  so  that  too  will  be  straightforward. 

We denote the sample space for subject S at point P by 𝒮_{S,P}. Our approach in assigning these
sample spaces is discussed by Halpern and Tuttle, where they describe it as “correspond[ing]
to what decision theorists would call an agent’s posterior probability” [22, §6]. In particular,
we choose 𝒮_{S,P} to be the set of points within tree(P) that have the same history of inputs
and outputs on channels in S as occur on the path to point P. Essentially, this means that
S’s probability space takes into account all inputs and outputs that S has seen up to the
current point; S does not forget anything it has seen. More precisely, we have the following
definitions.

Definition 4.5 Let S ⊆ C be a subject and let ρ_1 = (α_1, β_1, γ_1) and ρ_2 = (α_2, β_2, γ_2) be
two runs (not necessarily in the same tree). We say that ρ_1 and ρ_2 have the same S-history
up to time k if and only if⁶

∀i, 1 ≤ i ≤ k, ∀c ∈ S, α_1(c, i) = α_2(c, i) ∧ β_1(c, i) = β_2(c, i)

□


Definition 4.6 Let S ⊆ C be a subject and let P_1 = (ρ_1, k_1) and P_2 = (ρ_2, k_2) be two
points (not necessarily in the same tree). We say that P_1 and P_2 have the same S-history if
and only if the following two conditions hold.

1. k_1 = k_2.

2. ρ_1 and ρ_2 have the same S-history up to time k_1.


□ 


Definition  4.7  Since  points  are  unique  even  across  trees,  for  a  given  point  P,  there  is  no 
ambiguity  in  referring  to  “the  tree  that  contains  P”.  In  the  following,  we  will  use  tree(P)  to 
denote  that  tree.  □ 

Definition 4.8 Let S ⊆ C be a subject and P be a point; the sample space for S at point
P is given by

𝒮_{S,P} = { P' | tree(P') = tree(P) ∧ P' and P have the same S-history }


□ 
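
Definitions 4.5, 4.6, and 4.8 can be read operationally. The following sketch is an illustration only, assuming a point is stored as a tree identifier together with per-step dictionaries mapping channels to values; the names `Point`, `same_history`, and `sample_space` are hypothetical.

```python
from typing import NamedTuple

class Point(NamedTuple):
    tree: str       # identifier of the computation tree (i.e., of the adversary)
    inputs: tuple   # inputs[i][c]  = input on channel c at step i
    outputs: tuple  # outputs[i][c] = output on channel c at step i

def same_history(subject, p1, p2):
    """True iff p1 and p2 occur at the same time and agree on every channel in subject."""
    if len(p1.inputs) != len(p2.inputs):
        return False
    return all(
        a1[c] == a2[c] and b1[c] == b2[c]
        for a1, a2, b1, b2 in zip(p1.inputs, p2.inputs, p1.outputs, p2.outputs)
        for c in subject
    )

def sample_space(subject, p, points):
    """Points in the same tree as p that the subject cannot tell apart from p."""
    return [q for q in points if q.tree == p.tree and same_history(subject, p, q)]

p = Point("T_A", ({"h": 0, "l": 0},), ({"h": 1, "l": 0},))
q = Point("T_A", ({"h": 1, "l": 0},), ({"h": 0, "l": 0},))  # differs from p on h only
r = Point("T_A", ({"h": 0, "l": 0},), ({"h": 1, "l": 1},))  # differs from p on l output
s = Point("T_B", ({"h": 0, "l": 0},), ({"h": 1, "l": 0},))  # same values, other tree
low_view = sample_space({"l"}, p, [p, q, r, s])
```

With the subject {l}, the points p and q fall in the same sample space because they differ only on the high channel, while s is excluded solely because it lies in a different tree.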


Now, for a given point P, we will assign truth values to temporal formulae φ at that point.
In addition, we assign values to variables, for example the input on a channel, at that point.
The assignment function that does both of these is denoted by v_P.

To define v_P, we will need to assign truth values to formulae containing primed variables.
Therefore we will also define functions v_{(P_1,P_2)} (where P_1 and P_2 are points and we think
of P_2 as being a child of P_1 in some tree) to assign truth values to formulae over a pair of
points.

⁶ In other settings, we might also consider the possibility that a subject S has internal state variables and
could use these to make finer distinctions between points. However, in our application, all of the internal
processing of the relevant subjects (viz., ℋ and ℒ) is encoded in the adversary and is thus factored out of
the computation tree. We therefore do not lose any needed generality in making this definition.

We define v_P and v_{(P_1,P_2)} mutually recursively below. First we present some additional
notation.

Notation Since there is a one-to-one correspondence from trees to adversaries, we can refer
to “the adversary corresponding to tree(P)”. We denote that adversary by A(P).

We use the notation succ(P) to denote the set of nodes that immediately succeed P in
tree(P) (i.e., the children of P).

We use the notation extensions(P) to denote the set of infinite sequences of states starting
at P in tree(P). □

We now define v_P and v_{(P_1,P_2)}. Let P be a point at time k in the execution ρ = (α, β, γ) in
computation tree T_A.

•  Numbers  are  assigned  to  number  names. 

•  Members  of  /  and  O  are  assigned  to  their  syntactic  identifiers. 

• For any channel c ∈ C,

v_P(c_in) = α(c, k)

• For any channel c ∈ C,

v_P(c_out) = β(c, k)

• For any variable name, X, excluding channel variables (such as c_in or c_out),

v_P(X) = γ(X, k)

• Members of ℝ, I, and O are assigned values at a pair of points by referring to their
values in the first of the points, e.g.,

v_{(P_1,P_2)}(0.5) = v_{P_1}(0.5)

In contrast, variables may change their value from one point to the next, so unprimed
variables are evaluated by referring to the first point, e.g., for a state variable X,

v_{(P_1,P_2)}(X) = v_{P_1}(X)

whereas primed variables are evaluated by referring to the second point, e.g.,

v_{(P_1,P_2)}(X') = v_{P_2}(X)


• Composite terms are assigned values at a pair of points by evaluating the constituent
parts at the same pair of points and applying the corresponding semantic operator,
e.g.,

v_{(P_1,P_2)}(X' + Y') = v_{(P_1,P_2)}(X') + v_{(P_1,P_2)}(Y')

• Similarly, nonmodal formulae are assigned truth values at a pair of points by evaluating
the constituent parts at the same pair of points, e.g.,

v_{(P_1,P_2)}(X < Y) = true iff v_{(P_1,P_2)}(X) < v_{(P_1,P_2)}(Y)

and

v_{(P_1,P_2)}(φ ∧ ψ) = true iff v_{(P_1,P_2)}(φ) = true and v_{(P_1,P_2)}(ψ) = true

• To interpret the probability of a nonmodal formula φ at a point P, we will take the set
of all pairs of points, (P_1, P_2), where P_1 is in 𝒮_{S,P} and P_2 emanates from P_1. Restricting
to this set, we compute the probability of those pairs such that v_{(P_1,P_2)}(φ) evaluates to
true. More precisely, for any nonmodal formula, φ, and for any subject S ⊆ C,

v_P(Pr_S(φ)) = μ_{A(P)}(𝒮_{S,P}(φ) | 𝒮_{S,P})

where

𝒮_{S,P}(φ) = { P_2 | ∃P_1 ∈ 𝒮_{S,P} such that P_2 ∈ succ(P_1) and v_{(P_1,P_2)}(φ) = true }

• An atomic formula, φ, is true at a point, P, iff it is true for all pairs of points emanating
from P. More precisely,

v_P(φ) = true iff ∀P' ∈ succ(P), v_{(P,P')}(φ) = true

(Since we have not needed to include quantification in our language we are free to use
‘∀’ and ‘∃’ as metalinguistic shorthand.)

• For any formula, φ,

v_P(□φ) = true iff ∀ρ ∈ extensions(P), ∀i, v_{ρ_i}(φ) = true
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•  Composite  formulae  are  assigned  truth  values  at  points  in  the  natural  way.  For  exam¬ 
ple, 

vP(ip  A  t/’)  =  true  iff  vP(p)  =  true  and  vP(if))  =  true 

• Our knowledge operator is an S5 modal operator and is given semantics in terms of
the accessibility relation (on points) in the standard way; viz., for any two points, $P_1$
and $P_2$ (not necessarily in distinct trees), and any subject, $S \subseteq C$, we say that $P_2$ is
accessible from $P_1$, denoted $\kappa_S(P_1, P_2)$, if and only if $P_1$ and $P_2$ have the same $S$-history;
further, we use these accessibility relations to assign truth values to formulae
of the form $K_S(\varphi)$ as follows.

$v_P(K_S(\varphi)) = \mathrm{true}$ iff $\forall P',\ \kappa_S(P, P')$ implies $v_{P'}(\varphi) = \mathrm{true}$

In the remainder of the paper, for a model $M = (\mathbb{R}, +, \cdot, <, W, T, C, I, O, v, \kappa_1, \ldots, \kappa_{|\mathcal{P}(C)|})$,
formula $\varphi$, and set of formulae $\Gamma$, we will use ‘$M \models \varphi$’ to indicate that $\varphi$ evaluates to true
at the roots of all trees in $T$ and ‘$M \models \Gamma$’ to indicate that all members of $\Gamma$ evaluate to true
at the roots of all trees in $T$.⁷ Finally, we will use ‘$\Gamma \models \varphi$’ to indicate that $M \models \Gamma$ implies
$M \models \varphi$ for every model $M$.
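The accessibility relation and the clause for the knowledge operator can be tried out on a small finite set of points. The following Python sketch is only an illustration under stated assumptions: the encoding of a point as a tuple of per-step channel readings, the channel names "lo" and "hi", and all function names are invented for the example and are not part of the formal model.

```python
# Assumption: a point is a tuple of steps; each step is a tuple of
# (channel, value) pairs giving the I/O observed on every channel.

def history(point, subject):
    """Restrict every step of a point's I/O history to the channels in `subject`."""
    return tuple(
        tuple((ch, val) for ch, val in step if ch in subject)
        for step in point
    )

def accessible(p1, p2, subject):
    """kappa_S(p1, p2): the two points have the same S-history."""
    return history(p1, subject) == history(p2, subject)

def knows(point, subject, formula, all_points):
    """v_P(K_S(phi)): phi is true at every point accessible from `point`."""
    return all(formula(q) for q in all_points if accessible(point, q, subject))

# Three one-step points over a high and a low channel.
p_a = ((("hi", 0), ("lo", 1)),)
p_b = ((("hi", 1), ("lo", 1)),)
p_c = ((("hi", 1), ("lo", 0)),)
points = [p_a, p_b, p_c]
low = frozenset({"lo"})

lo_is_1 = lambda p: dict(p[-1])["lo"] == 1
hi_is_0 = lambda p: dict(p[-1])["hi"] == 0

print(knows(p_a, low, lo_is_1, points))  # the low subject sees its own channel
print(knows(p_a, low, hi_is_0, points))  # p_b is accessible and has hi = 1
```

At `p_a` the low subject knows its own reading but not the high one, because `p_b` has the same low history and a different high value.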


5  Soundness 


In §6 and §7 below we give two syntactic characterizations of security and show that the semantic
interpretations of our syntactic characterizations are equivalent to certain previously
developed  definitions.  However,  the  significance  of  these  results  is  greatly  reduced  unless 
the  logic  is  sound.  For,  without  soundness  there  is  no  guarantee  that  any  formal  proof  of 
security  implies  any  independently  motivated  notion  of  security.  A  soundness  theorem  gives 
us  just  such  a  correspondence. 


Theorem 5.1 [Soundness] Given a set of formulae of our language, $\Gamma$, and a formula $\varphi$,

if $\Gamma \vdash \varphi$, then $\Gamma \models \varphi$.


□ 

7 Typically, semantics for modal logics treat truth in a model as truth in all possible worlds in that model.
Those  more  familiar  with  this  usage  than  with  ours  should  note  that  on  a  computational  view  the  primary 
notion  is  that  of  a  run  rather  than  a  world  (state).  Thus,  truth  in  a  model  is  more  naturally  thought  of  as 
truth  in  all  runs  in  that  model  (hence,  at  the  initial  state  of  all  runs). 
Proof: In order to prove soundness we must show that the axioms are valid and the rules
are  truth  preserving  (except  Nec  which  need  only  be  theorem  preserving).  For  most  of  the 
axioms  and  all  of  the  rules  the  results  are  completely  standard.  (Cf.  Chellas  [7]  and  Goldblatt 
[16].)  Hence,  we  do  not  set  them  out  here.  We  specifically  assumed  a  semantics  in  which 
all  the  rules  and  axioms  concerning  logical  connectives  preserve  soundness.  Since  we  assume 
the  real  numbers  are  part  of  our  models,  the  axioms  concerning  them  must  all  be  valid. 
Likewise, because the $Pr_S(\varphi)$ terms are interpreted as conditional probabilities of events,
the  RV  axioms  are  valid  in  our  semantics  since  they  reflect  basic  facts  about  probability 
measures.  The  accessibility  relations,  set  out  above  in  §4,  are  clearly  equivalence  relations. 
Thus,  by  a  standard  result  of  modal  logic,  the  S5  axioms  are  all  valid  and  Nec  (for  the 
knowledge  operators)  is  theorem  preserving  (cf.  [7]).  The  temporal  reasoning  axioms  are 
similarly  valid  and  Nec  for  the  temporal  operator  is  theorem  preserving  based  on  the  time 
structure  of  our  model  of  computation  (cf.  [16]). 

All that remains is to show the soundness of our two additional axioms. To show the validity
of K□, let $P_1$ be a point where

$v_{P_1}(\Box K_S(\varphi)) = \mathrm{false}$

Then, by our definition of the semantic assignment function, there exist $P_2$, $\rho_2$, and $k_2$ such
that the following three conditions hold.

$\kappa_S(P_1, P_2)$ (2)

$\rho_2 \in extensions(P_2)$ (3)

$v_{(\rho_2,k_2)}(\varphi) = \mathrm{false}$ (4)

Now, $(\rho_2, k_2)$ is a point in an extension of $P_2$. Hence, Equation 4 implies

$v_{P_2}(\Box\varphi) = \mathrm{false}$ (5)

which, along with Formula 2, implies

$v_{P_1}(K_S(\Box\varphi)) = \mathrm{false}$

and K□ is valid.

To show the validity of KPr, we'll assume $P_1$ is a point such that

$v_{P_1}(K_S(Pr_C(\varphi))) = r$ (6)

and show that

$v_{P_1}(K_S(Pr_S(\varphi))) = r$ (7)

Applying the semantic assignment function to Equation 6 implies that for all $P'$, $\kappa_S(P_1, P')$
implies

$\mu_{A(P')}(\mathcal{S}_{C,P'}(\varphi) \mid \mathcal{S}_{C,P'}) = r$ (8)

To show Equation 7, let $P_2$ be a point such that $\kappa_S(P_1, P_2)$. By the semantic assignment
function, we have the following.

$v_{P_2}(Pr_S(\varphi)) = \mu_{A(P_2)}(\mathcal{S}_{S,P_2}(\varphi) \mid \mathcal{S}_{S,P_2})$ (9)


By  the  definition  of  conditional  probability  and  the  additive  property  of  probability  measures, 
we  can  expand  the  right-hand  side  of  Equation  9  to  get: 


$v_{P_2}(Pr_S(\varphi)) = \dfrac{\sum_{P'} \mu_{A(P_2)}(\mathcal{S}_{S,P_2}(\varphi) \mid \mathcal{S}_{C,P'})\, \mu_{A(P_2)}(\mathcal{S}_{C,P'})}{\mu_{A(P_2)}(\mathcal{S}_{S,P_2})}$ (10)

(where the summation is taken over all $P'$ such that $P'$ is in the same tree and has the same
$S$-history as $P_2$).

Limiting $\mathcal{S}_{S,P_2}(\varphi)$ to those points emanating from $\mathcal{S}_{C,P'}$ results in $\mathcal{S}_{C,P'}(\varphi)$, so Equation 10
can be rewritten as:

$v_{P_2}(Pr_S(\varphi)) = \dfrac{\sum_{P'} \mu_{A(P_2)}(\mathcal{S}_{C,P'}(\varphi) \mid \mathcal{S}_{C,P'})\, \mu_{A(P_2)}(\mathcal{S}_{C,P'})}{\mu_{A(P_2)}(\mathcal{S}_{S,P_2})}$ (11)

Since $P_2$ and $P'$ are in the same tree, $A(P_2) = A(P')$. Also, since $\kappa_S(P_1, P_2)$, all of the $P'$ in
the above equation have the same $S$-history as $P_1$. Therefore, by Equation 8,

$v_{P_2}(Pr_S(\varphi)) = \dfrac{r \sum_{P'} \mu_{A(P_2)}(\mathcal{S}_{C,P'})}{\mu_{A(P_2)}(\mathcal{S}_{S,P_2})}$ (12)

which, again by the additive property of probability measures, implies

$v_{P_2}(Pr_S(\varphi)) = r$ (13)

and  KPr  is  valid,  which  completes  the  proof.  □ 
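The averaging step behind Equations 10 through 13 can be checked numerically. The sketch below is illustrative only: it assumes that the sample space of subject S splits into finitely many disjoint cells (one per distinct C-history), each with a known mass and a known conditional probability of the formula; the function name and the particular numbers are invented for the example.

```python
from fractions import Fraction

def subjective_probability(cells):
    """Probability of a formula given S's sample space.

    `cells` is a list of (mass, conditional_probability) pairs, one per
    C-cell contained in S's sample space. The result is the mass-weighted
    average of the conditional probabilities (law of total probability).
    """
    total_mass = sum(mass for mass, _ in cells)
    weighted = sum(mass * cond for mass, cond in cells)
    return weighted / total_mass

r = Fraction(1, 3)
# Every C-cell assigns the formula the same objective probability r,
# as Equation 8 requires; the masses are arbitrary.
cells = [(Fraction(1, 10), r), (Fraction(1, 5), r), (Fraction(1, 2), r)]

print(subjective_probability(cells))
```

Whatever masses are chosen, the weighted average of a constant is that constant, which is the content of the step from Equation 12 to Equation 13.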

This  completes  our  discussion  of  the  logic  itself.  In  the  remainder  of  the  paper  we  focus  on 
security  and  applications  of  the  logic  thereto. 
6  Formal  Definition  of  Security


In this section, we give our primary definition of security, which we call the Formal Security
Condition (FSC), and show that its meaning is equivalent to our own Probabilistic
Noninterference (PNI) [17], which is itself equivalent to Browne’s independently developed
Stochastic Non-Interference [3]. As described in [17], PNI is motivated by previous work on
Noninterference by Goguen and Meseguer [15] and by connections to information theory. In
particular, when the system is modeled as a two-way channel with memory ([38]), PNI implies
that there is no information flow over the channel from the covert senders to the covert
receivers ([17]). (See Browne [4] for other connections to classical information theory.)

In  contrast,  the  intuition  for  our  definition  of  security  in  the  present  paper  derives  from  an 
understanding  of  what  the  low  subject  knows  when  using  a  secure  system  versus  an  insecure 
system.  The  intuition  for  our  definition  is  as  follows. 

The system under consideration is “secure” iff before the low subject receives any
given output (b) with any given probability (r), it will already know that b is
about to occur with probability r.

In  essence,  since  the  low  subject  already  knows  the  probability  distribution  over  its  upcoming 
outputs,  it  cannot  learn  any  new  information  when  it  actually  receives  those  outputs.  To 
make  this  precise,  we  introduce  the  following  shorthand. 

Notation Recall that a subject is formalized as a subset of $C$, the set of $\Sigma$’s communication
channels; this subset represents the subject’s view of the system. For a given subject
$L = \{l_1, l_2, \ldots, l_n\} \subseteq C$ we often require a formula specifying what $L$ receives as output at the
next time. This can be specified as

$(l_1)'_{out} = b_1 \wedge (l_2)'_{out} = b_2 \wedge \ldots \wedge (l_n)'_{out} = b_n$

(where $b_i \in O$ for $1 \le i \le n$). We will specify this more compactly as

$L'_{out} = b_L$

(where $L \subseteq C$ is the subject equal to $\{l_1, l_2, \ldots, l_n\}$ and $b_L \in O[L]$ is the output vector equal
to $[b_1, b_2, \ldots, b_n]$). □

We  now  define  the  Formal  Security  Condition  as  follows. 
Definition 6.1 Let $L \subseteq C$ be a subject. Suppose a system $\Sigma$ is described by a set of
formulae in our logic, $\Gamma$. We say that $\Gamma$ satisfies the Formal Security Condition (FSC) with
respect to $L$ if and only if, for every $b_L \in O[L]$, the formula

$\Box(Pr_L(L'_{out} = b_L) = r \rightarrow K_L(Pr_L(L'_{out} = b_L) = r))$

is derivable from $\Gamma$. □

Note  that  this  definition  refers  only  to  L’s  next  output.  Nevertheless,  we  will  see  below,  in 
Theorem 6.9, that this is sufficient to ensure that high behavior has no effect on any L-events,
including  all  future  outputs  visible  to  L. 

At  first  glance,  this  property  may  appear  too  strong  to  be  satisfied  by  useful  systems.  In 
particular,  the  reader  may  wonder: 

if  users  know  the  probability  distribution  over  their  outputs  before  they  get  them, 
why  would  they  bother  to  use  the  system  at  all?  After  all,  they  won’t  learn 
anything  by  using  it. 

To  see  why  this  is  not  a  concern,  we  need  to  keep  in  mind  that  the  low  subject  L  represents 
not  a  single  user,  but  rather,  the  entire  low  environment.  For  example,  suppose  the  system 
we  are  analyzing  is  a  two-level  database  containing  unclassified  and  secret  information.  In 
this  case,  L  represents  all  users  and  processes  that  are  operating  at  the  unclassified  level, 
including  the  users  and  processes  involved  in  entering  and  updating  unclassified  data.  Thus, 
an  individual  low  user  may  not,  in  practice,  know  the  answer  to  his  query  before  submitting 
it,  but  in  principle  the  information  is  available  to  him,  since  he  can  (in  principle)  know  the 
entire  history  of  the  low  environment,  including  all  low  inputs. 

On  the  other  hand,  the  reader  may  now  wonder: 

in  what  way  can  a  system  fail  to  satisfy  FSC?  That  is,  in  what  case  does  the  low 
environment  fail  to  know  the  probability  distribution  on  its  next  output? 

The  answer  is:  in  precisely  those  cases  where  that  probability  distribution  is  affected  by 
the  high  environment.  That  is,  if  the  high  environment  can  influence  the  probability  with 
which  the  low  environment  gets  certain  outputs,  then  the  low  environment  will  not  know 
that  probability  distribution  (except,  e.g.,  by  statistical  inference  after  it  has  received  those 
outputs). Further, from information theory we know this is precisely the situation in which
the high environment can send information to the low environment, i.e., this is the situation
in  which  the  system  has  a  covert  channel. 
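A one-step toy system makes the point concrete. The sketch below is an illustration only and is not taken from the formal model: it assumes a hypothetical system in which the high environment inputs a single bit and the system copies that bit to the low output through a noisy gate; the function name and the parameters are invented for the example.

```python
from fractions import Fraction

def low_output_prob(p_high_one, flip):
    """Probability that the low output bit is 1 in the toy system.

    The high environment inputs 1 with probability `p_high_one`; the system
    copies that bit to the low output, flipping it with probability `flip`.
    """
    return p_high_one * (1 - flip) + (1 - p_high_one) * flip

half = Fraction(1, 2)
quarter = Fraction(1, 4)
high_behaviours = (Fraction(0), half, Fraction(1))

# Perfect noise: the low output distribution is the same for every high
# behaviour, so the low environment can know it in advance.
secure = {low_output_prob(p, half) for p in high_behaviours}

# Imperfect noise: the distribution varies with the high behaviour, so the
# low environment cannot know it without knowing what high is doing.
leaky = {low_output_prob(p, quarter) for p in high_behaviours}

print(secure)
print(leaky)
```

With a flip probability of one half there is a single possible distribution on the low output; with any other flip probability the high environment selects among several, which is exactly the covert channel described above.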

Now  we  would  like  to  show  that  FSC  is  equivalent  to  PNI.  To  do  so,  we  will  talk  about  the 
“meaning”  of  FSC,  or  more  precisely,  the  semantic  interpretation  of  FSC,  which  we  define 
as  follows. 

Definition 6.2 We say that $\Gamma$ satisfies the Semantic Interpretation of FSC with respect to
$L$ if and only if, for every $b_L \in O[L]$,

$\Gamma \models \Box(Pr_L(L'_{out} = b_L) = r \rightarrow K_L(Pr_L(L'_{out} = b_L) = r))$


□ 

To  prove  the  semantic  interpretation  of  FSC  is  equivalent  to  PNI,  we  also  need  to  recast  the 
latter  in  terms  of  our  model.  We  do  this  as  follows. 

Definition 6.3 Let $A_1$ and $A_2$ be two adversaries that satisfy the Secure Environment
Assumption. We will say that $A_1$ and $A_2$ agree on $L$ behavior iff there exist $\mathcal{H}_1$, $\mathcal{H}_2$,
and $\mathcal{L}$ such that $\mathcal{H}_1$ and $\mathcal{L}$ are the unique probability functions that describe $A_1$ (as in
Definition 2.2) and $\mathcal{H}_2$ and $\mathcal{L}$ are the unique probability functions that describe $A_2$. □

Definition 6.4 Let $S \subseteq C$ be a subject and let $e$ be a set of runs, $\{\rho_i\}$, (not necessarily
taken from any one computation tree). We say that $e$ is an $S$-event if and only if there exists
a time $k \in \mathbb{N}^+$ such that for any two runs, $\rho_1$ and $\rho_2$, having the same $S$-history up to time
$k$, $\rho_1 \in e$ iff $\rho_2 \in e$.

For an $S$-event, $e$, we will refer to the least $k$ such that the above condition holds as the length
of $e$. □

Intuitively,  e  is  an  S-event  if  and  only  if  there  is  some  finite  time  k  (i.e.,  its  length)  after 
which  S  can  always  determine  whether  or  not  e  has  occurred. 
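For a finite set of finite runs, the condition in Definition 6.4 can be checked by brute force. The sketch below is an illustration under assumptions made only for the example: runs are tuples of steps, each step is a dictionary from channel names to values, and the search is bounded by a `horizon`; none of these names come from the formal model.

```python
from itertools import product

def s_history(run, subject, k):
    """S-history of `run` up to time k: the first k steps restricted to `subject`."""
    return tuple(
        tuple(sorted((ch, v) for ch, v in step.items() if ch in subject))
        for step in run[:k]
    )

def event_length(event, all_runs, subject, horizon):
    """Least k <= horizon such that membership in `event` is determined by the
    S-history up to time k, or None if no such k exists within the horizon."""
    for k in range(horizon + 1):
        decided = {}
        consistent = True
        for run in all_runs:
            key = s_history(run, subject, k)
            member = run in event
            # Two runs with the same S-history must agree on membership.
            if decided.setdefault(key, member) != member:
                consistent = False
                break
        if consistent:
            return k
    return None

# All two-step runs over one low and one high binary channel.
steps = [{"lo": l, "hi": h} for l, h in product((0, 1), repeat=2)]
all_runs = [tuple(r) for r in product(steps, repeat=2)]
low = {"lo"}

first_lo_is_1 = [r for r in all_runs if r[0]["lo"] == 1]
first_hi_is_1 = [r for r in all_runs if r[0]["hi"] == 1]

print(event_length(first_lo_is_1, all_runs, low, 2))
print(event_length(first_hi_is_1, all_runs, low, 2))
```

The first set is a low event because the low subject can decide membership after one step; the second is not, since two runs with identical low histories can differ on the first high value.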

Note that in general, an $S$-event contains runs from more than one computation tree. Therefore,
such “events” will not be measurable in any of our probability spaces. Rather, we think
of them as meta-events and we will be interested in the measure of the subset of the runs that
are contained in a given computation tree. To make this precise, we introduce the following
definition.


Definition 6.5 Given a computation tree, $T_A$, and an $S$-event, $e$, the projection of $e$ onto
$T_A$, denoted $e_A$, is given by:

$e_A = runs(T_A) \cap e$


□ 


When it is clear from context what is meant, we ignore the distinction between meta-events
and their projections, e.g., we write ‘$\mu_A(e)$’ for ‘$\mu_A(e_A)$’.

Observation 6.6 Every projection of every $S$-event is measurable. That is, for any $S$-event,
$e$, and any computation tree, $T_A$,

$e_A \in X_A$

This is due to the restriction on $S$-events that they be observable within some finite time.
In particular, the projection of an $S$-event onto a tree, $T$, must also be observable within a
finite time, and so it must be formable from a finite number of unions and complementations
of the generators of $T$. □

Definition 6.7 Let $\Sigma$ be a system with computation trees $T(\Sigma)$. We say that $\Sigma$ satisfies
Probabilistic Noninterference (PNI) with respect to a subject $L \subseteq C$ iff for any two trees
satisfying the Secure Environment Assumption, $T_A, T_{A'} \in T(\Sigma)$, and any $L$-event, $e$, if $A$
and $A'$ agree on $L$ behavior, then

$\mu_A(e) = \mu_{A'}(e)$

□ 


Now  we  are  in  a  position  to  state  the  main  theorem  of  this  section.  Before  doing  so,  we  state 
and  prove  a  lemma. 

Lemma 6.8 If $T_A$ and $T_{A'}$ are two trees such that $A$ and $A'$ agree on $L$ behavior (and satisfy
the Secure Environment Assumption) then the following two conditions are equivalent.

1. For any low output vector, $b_L \in O[L]$, and any two points $P_1 \in T_A$ and $P_2 \in T_{A'}$ such
that $\kappa_L(P_1, P_2)$,

$\mu_{A(P_1)}(\mathcal{S}_{L,P_1}(L'_{out} = b_L) \mid \mathcal{S}_{L,P_1}) = \mu_{A(P_2)}(\mathcal{S}_{L,P_2}(L'_{out} = b_L) \mid \mathcal{S}_{L,P_2})$

2. For any $L$-event, $e$,

$\mu_A(e_A) = \mu_{A'}(e_{A'})$


□ 

Proof: We begin by observing that $\mathcal{S}_{L,P_1}(L'_{out} = b_L)$ and $\mathcal{S}_{L,P_2}(L'_{out} = b_L)$ are projections of
the same $L$-event and that $\mathcal{S}_{L,P_1}$ and $\mathcal{S}_{L,P_2}$ are projections of another $L$-event. Therefore the
backward direction of the lemma (i.e., that condition 2 implies condition 1) follows easily. In
particular, let $P_1 \in T_A$ and $P_2 \in T_{A'}$ be two points such that $\kappa_L(P_1, P_2)$. Then condition 2
implies

$\mu_{A(P_1)}(\mathcal{S}_{L,P_1}) = \mu_{A(P_2)}(\mathcal{S}_{L,P_2})$

and further (since the intersection of two $L$-events is again an $L$-event) that

$\mu_{A(P_1)}(\mathcal{S}_{L,P_1}(L'_{out} = b_L) \cap \mathcal{S}_{L,P_1}) = \mu_{A(P_2)}(\mathcal{S}_{L,P_2}(L'_{out} = b_L) \cap \mathcal{S}_{L,P_2})$

Therefore  condition  1  holds  by  the  definition  of  conditional  probability. 

To  prove  the  forward  part  of  the  lemma,  we  start  by  showing  that  it  holds  for  a  certain 
subset of $L$-events, namely those $L$-events corresponding to finite $L$-histories.

Let $e$ be an $L$-event such that there exists a time, $k$, (the length of $e$) and a characteristic
run, $\rho$, such that for any run, $\rho'$,

$\rho' \in e$ iff $\rho'$ has the same $L$-history as $\rho$ up to time $k$

That is, $e$ corresponds to the finite $L$-history characterized by $\rho$ up to time $k$.

We  now  prove  the  forward  part  of  the  lemma  for  this  subclass  of  L-events  by  induction  on 
the  length  of  e. 

Base  case:  The  length  of  e  is  zero. 

Since all runs have the same $L$-history up to time 0, the only two $L$-events of length 0 are
the empty set, $\emptyset$, and the set of all runs from all trees, $\mathcal{R}$. In the former case,

$\mu_A(\emptyset_A) = 0 = \mu_{A'}(\emptyset_{A'})$

and in the latter case,

$\mu_A(\mathcal{R}_A) = 1 = \mu_{A'}(\mathcal{R}_{A'})$


Thus,  the  base  case  is  proved. 

Induction case: Assume condition 2 holds for all $L$-events (corresponding to finite $L$-histories)
of length $k$. Let $e$ be an $L$-event corresponding to a finite $L$-history of length $k + 1$.
Suppose that $\rho$ is a run that (up to time $k + 1$) characterizes $e$.

Now, let $\bar{e}$ be the $L$-event characterized by $\rho$ up to time $k$. Intuitively, $\bar{e}$ corresponds to the
finite $L$-history obtained by truncating $e$ at time $k$. By the induction hypothesis,

$\mu_A(\bar{e}_A) = \mu_{A'}(\bar{e}_{A'})$ (14)

If $\mu_A(\bar{e}_A) = 0$, then $\mu_A(e) = 0 = \mu_{A'}(e)$ and the induction case holds trivially, so we assume
$\mu_A(\bar{e}_A) > 0$.

By Equation 14, we also have that $\mu_{A'}(\bar{e}_{A'}) > 0$. Thus, by the definition of conditional
probability,

$\mu_A(e) = \mu_A(\bar{e}) \cdot \mu_A(e \mid \bar{e})$ (15)

and

$\mu_{A'}(e) = \mu_{A'}(\bar{e}) \cdot \mu_{A'}(e \mid \bar{e})$ (16)

Let $\alpha \in I_{L,k}$ and $\beta \in O_{L,k}$ be the low input and output history, resp., that characterize $\bar{e}$
and let $a_L \in I[L]$ and $b_L \in O[L]$ be the low input and output vectors at time $k + 1$ that are
needed to additionally characterize $e$. Then, by the construction of our probability measures
(as described in §2.2) and by the Secure Environment Assumption, we have that

$\mu_A(e \mid \bar{e}) = \mu_A(b_L \mid \bar{e}) \cdot \mathcal{L}(a_L \mid \alpha, \beta)$ (17)

$\mu_{A'}(e \mid \bar{e}) = \mu_{A'}(b_L \mid \bar{e}) \cdot \mathcal{L}'(a_L \mid \alpha, \beta)$ (18)

where $b_L$ also denotes the meta-event representing that the low output vector at time $k + 1$ is $b_L$,
and $\mathcal{L}$ and $\mathcal{L}'$ are the low environments of $A$ and $A'$, respectively.

Since $A$ and $A'$ agree on $L$ behavior,

$\mathcal{L}(a_L \mid \alpha, \beta) = \mathcal{L}'(a_L \mid \alpha, \beta)$ (19)

Further, since $\mu_A(\bar{e})$ and $\mu_{A'}(\bar{e})$ are both greater than zero, there are points in both trees,
$P_1 \in T_A$ and $P_2 \in T_{A'}$, each of whose $L$-histories is $(\alpha, \beta)$. By condition 1,

$\mu_{A(P_1)}(\mathcal{S}_{L,P_1}(L'_{out} = b_L) \mid \mathcal{S}_{L,P_1}) = \mu_{A(P_2)}(\mathcal{S}_{L,P_2}(L'_{out} = b_L) \mid \mathcal{S}_{L,P_2})$

But notice that $\mathcal{S}_{L,P_1}$ and $\mathcal{S}_{L,P_2}$ are projections of $\bar{e}$ and $\mathcal{S}_{L,P_1}(L'_{out} = b_L)$ and $\mathcal{S}_{L,P_2}(L'_{out} = b_L)$
are projections of $b_L$. Therefore,

$\mu_A(b_L \mid \bar{e}) = \mu_{A'}(b_L \mid \bar{e})$ (20)

Thus, by Equations 17, 18, 19, and 20, we have that

$\mu_A(e \mid \bar{e}) = \mu_{A'}(e \mid \bar{e})$ (21)

and finally, by Equations 14, 15, 16, and 21, we have that

$\mu_A(e) = \mu_{A'}(e)$


and  the  induction  case  is  proved. 

Now, we can complete the proof by observing that every $L$-event can be constructed by
taking a finite number of unions and complementations of $L$-events that correspond to finite
$L$-histories. That is, the $L$-events that correspond to finite $L$-histories are analogous to the
generators of our event spaces. Thus, the desired result that $\mu_A(e) = \mu_{A'}(e)$ for arbitrary
$L$-events follows from the fact that the measures are equal on all of the $L$-events that correspond
to finite $L$-histories. □
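Unrolling the induction shows why the measure of a finite $L$-history cannot depend on the high environment. The indexing used here is introduced only for this illustration and does not appear in the proof: $e_j$ is the truncation of $e$ at time $j$ (so $e_0$ is the set of all runs and $e_k = e$), $a_L^{(j)}$ and $b_L^{(j)}$ are the low input and output vectors of the characteristic run at time $j$, and $\alpha{\restriction}j$, $\beta{\restriction}j$ are its low input and output histories up to time $j$. Assuming every $\mu_A(e_j)$ is positive, repeated use of Equations 15 and 17 gives:

```latex
\mu_A(e) \;=\; \prod_{j=0}^{k-1}
   \mu_A\bigl(b_L^{(j+1)} \mid e_j\bigr)\,
   \mathcal{L}\bigl(a_L^{(j+1)} \mid \alpha{\restriction}j,\ \beta{\restriction}j\bigr)
```

Each output factor is the same in both trees by condition 1, and each environment factor is the same because $A$ and $A'$ agree on $L$ behavior, so the two products coincide.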

We  can  now  prove  the  following  theorem  relating  PNI  and  FSC. 

Theorem 6.9 Let $\Gamma$ be a set of formulae describing $\Sigma$ and let $L \subseteq C$ be a subject. Then, $\Sigma$
satisfies PNI with respect to $L$ iff $\Gamma$ satisfies the semantic interpretation of FSC with respect
to $L$. □

Proof: Let $M = (\mathbb{R}, +, \cdot, <, W, T, C, I, O, v, \kappa_1, \ldots, \kappa_{|\mathcal{P}(C)|})$ be a model such that $M \models \Gamma$.

$\Gamma$ satisfies the semantic interpretation of FSC (wrt $L$) iff for every $b_L \in O[L]$,

$M \models \Box(Pr_L(L'_{out} = b_L) = r \rightarrow K_L(Pr_L(L'_{out} = b_L) = r))$

which holds iff for every $b_L \in O[L]$ and every root, $P$, of any tree in $T$,

$v_P(\Box(Pr_L(L'_{out} = b_L) = r \rightarrow K_L(Pr_L(L'_{out} = b_L) = r))) = \mathrm{true}$

which, by applying the semantic assignment function, holds iff for every $b_L \in O[L]$ and every
two points $P_1, P_2 \in W$ such that $\kappa_L(P_1, P_2)$,

$v_{P_1}(Pr_L(L'_{out} = b_L) = r) = \mathrm{true}$ implies $v_{P_2}(Pr_L(L'_{out} = b_L) = r) = \mathrm{true}$

which holds iff for every $b_L \in O[L]$ and every two points $P_1, P_2 \in W$ such that $\kappa_L(P_1, P_2)$,

$v_{P_1}(Pr_L(L'_{out} = b_L)) = v_{P_2}(Pr_L(L'_{out} = b_L))$

which, again by the semantic assignment function, holds iff for every $b_L \in O[L]$ and every
two points $P_1, P_2 \in W$ such that $\kappa_L(P_1, P_2)$,

$\mu_{A(P_1)}(\mathcal{S}_{L,P_1}(L'_{out} = b_L) \mid \mathcal{S}_{L,P_1}) = \mu_{A(P_2)}(\mathcal{S}_{L,P_2}(L'_{out} = b_L) \mid \mathcal{S}_{L,P_2})$ (22)

It is therefore sufficient to show that $\Sigma$ satisfies PNI iff Equation 22 holds. That Equation 22
implies PNI follows easily from Lemma 6.8. In particular, let $A$ and $A'$ be two adversaries
that agree on $L$ behavior. From Equation 22, Lemma 6.8 implies that for any $L$-event $e$,

$\mu_A(e) = \mu_{A'}(e)$

Hence, $\Sigma$ satisfies PNI.

To show the reverse direction, assume $\Sigma$ satisfies PNI, let $b_L \in O[L]$ be arbitrary, and let
$P_1, P_2 \in W$ be two points such that $\kappa_L(P_1, P_2)$.

Note that we cannot apply PNI immediately, since it may be the case that $A(P_1)$ and $A(P_2)$
do not agree on $L$ behavior. For this reason, for each point $P$, we construct a new adversary
$\tilde{A}(P)$ as follows. (Note that $\tilde{A}(P)$ is an adversary corresponding to a tree that does not, in
general, contain $P$.)

Suppose $P = (\alpha, \beta, k)$ and $A(P) = (\mathcal{H}, \mathcal{L})$. Then $\tilde{A}(P) = (\mathcal{H}, \tilde{\mathcal{L}})$, where $\tilde{\mathcal{L}}$ is defined as
follows.

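For $k' < k$, $\alpha' \in I_{L,k'}$, and $\beta' \in O_{L,k'}$,

$\tilde{\mathcal{L}}(a \restriction L \mid \alpha', \beta') = \begin{cases} 1, & \text{if } a \restriction L = \alpha(k'+1) \restriction L; \\ 0, & \text{otherwise.} \end{cases}$

For $k' \ge k$, $\alpha' \in I_{L,k'}$, and $\beta' \in O_{L,k'}$,

$\tilde{\mathcal{L}}(a \restriction L \mid \alpha', \beta') = \begin{cases} 1, & \text{if } a \restriction L = a_0; \\ 0, & \text{otherwise} \end{cases}$

(where $a_0$ is some constant input on channels in $L$). That is, $\tilde{\mathcal{L}}$ blindly and deterministically
follows $\alpha \restriction L$ up to time $k$ and then blindly and deterministically inputs $a_0$ from then on.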
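The replacement low environment is simple enough to write down directly. The sketch below is illustrative and rests on assumptions made for the example: low input vectors are tuples, the recorded low inputs are held in a list whose entry `j` is the input for time `j + 1`, and the environment is represented as a function returning probability 1 or 0; the names are hypothetical.

```python
def make_replay_environment(low_inputs, constant_input):
    """Build a deterministic low environment that replays `low_inputs`
    and afterwards supplies `constant_input` forever.

    `low_inputs[j]` is the low input vector recorded for time j + 1,
    so a point at time k corresponds to a list of length k.
    """
    k = len(low_inputs)

    def env(next_input, history_length):
        # The environment is "blind": it ignores the content of the history
        # and uses only its length to decide which input comes next.
        if history_length < k:
            expected = low_inputs[history_length]
        else:
            expected = constant_input
        return 1 if next_input == expected else 0

    return env

env = make_replay_environment([(1,), (0,)], constant_input=(0,))
print(env((1,), 0))  # replaying the recorded input for time 1
print(env((0,), 5))  # past the recorded history: the constant input
```

Two points with the same low history yield the same replayed inputs, so the environments built from them coincide, which is why the resulting adversaries agree on low behavior.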

Now, note that according to our construction of the computation trees, there is a point
$P = (\alpha, \beta, k)$ in a given tree $T$ if and only if at each point leading up to $P$, say $P' = (\alpha, \beta, k')$
($k' < k$), the following three conditions all hold.

$\mathcal{O}(\beta(k'+1) \mid \alpha \restriction k', \beta \restriction k') > 0$ (23)

$\mathcal{H}(\alpha(k'+1) \restriction (C - L) \mid \alpha \restriction k', \beta \restriction k') > 0$ (24)

$\mathcal{L}(\alpha(k'+1) \restriction L \mid \alpha \restriction k' \restriction L, \beta \restriction k' \restriction L) > 0$ (25)

(where $\alpha \restriction k'$ and $\beta \restriction k'$ are the restrictions of $\alpha$ and $\beta$ to $I_{C,k'}$ and $O_{C,k'}$, respectively).

Further, since in constructing $T_{\tilde{A}(P)}$ we have retained the output probability function $\mathcal{O}$ and
high behavior $\mathcal{H}$ used in $T_{A(P)}$ and chosen $\tilde{\mathcal{L}}$ to ensure Equation 25 holds appropriately, there
is a point $\tilde{P} \in T_{\tilde{A}(P)}$ that has the identical I/O history as $P \in T_{A(P)}$. Further, for any point
$P' \in T_{A(P)}$ having the same $L$-history as $P$, there is a corresponding point $\tilde{P}' \in T_{\tilde{A}(P)}$ with
the same I/O history as $P'$.

Finally, note that by our construction of the computation trees, since $P'$ and $\tilde{P}'$ have the
same I/O history, say $(\alpha', \beta')$, the sum of the probabilities on arcs emanating from $P'$ where
$L'_{out} = b_L$ is equal to the sum of the probabilities on arcs emanating from $\tilde{P}'$ where $L'_{out} = b_L$.
In particular, they are both equal to

$\sum_{b \,:\, b \restriction L = b_L} \mathcal{O}(b \mid \alpha', \beta')$

Since this is the case for all points in $T_{A(P)}$ having the same $L$-history as $P$, we have that

$\mu_{A(P)}(\mathcal{S}_{L,P}(L'_{out} = b_L) \mid \mathcal{S}_{L,P}) = \mu_{\tilde{A}(P)}(\mathcal{S}_{L,\tilde{P}}(L'_{out} = b_L) \mid \mathcal{S}_{L,\tilde{P}})$ (26)

In particular, Equation 26 holds for both $P_1$ and $P_2$.
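The sum over full output vectors that restrict to a given low output vector is an ordinary marginalization. The sketch below is illustrative only: it assumes the output probabilities at one fixed I/O history are given as a dictionary keyed by full output vectors, each written as a tuple of (channel, value) pairs; the function name, channel names, and numbers are invented for the example.

```python
from fractions import Fraction

def marginal_low_output(output_dist, low_channels, b_low):
    """Sum the probabilities of all full output vectors b whose restriction
    to `low_channels` equals `b_low`, at one fixed I/O history."""
    return sum(
        prob
        for b, prob in output_dist.items()
        if tuple(v for ch, v in b if ch in low_channels) == b_low
    )

# Output probabilities at one fixed history (alpha', beta').
dist = {
    (("hi", 0), ("lo", 0)): Fraction(1, 8),
    (("hi", 0), ("lo", 1)): Fraction(3, 8),
    (("hi", 1), ("lo", 0)): Fraction(1, 8),
    (("hi", 1), ("lo", 1)): Fraction(3, 8),
}

print(marginal_low_output(dist, {"lo"}, (1,)))
```

The value depends only on the system's output function and the history, not on which adversary's tree the point lives in, which is what the argument for Equation 26 uses.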
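Now note that $\tilde{A}(P_1)$ and $\tilde{A}(P_2)$ agree on $L$ behavior. Therefore, since $\mathcal{S}_{L,\tilde{P}}(L'_{out} = b_L) \cap \mathcal{S}_{L,\tilde{P}}$
and $\mathcal{S}_{L,\tilde{P}}$ are $L$-events, PNI implies

$\mu_{\tilde{A}(P_1)}(\mathcal{S}_{L,\tilde{P}_1}(L'_{out} = b_L) \mid \mathcal{S}_{L,\tilde{P}_1}) = \mu_{\tilde{A}(P_2)}(\mathcal{S}_{L,\tilde{P}_2}(L'_{out} = b_L) \mid \mathcal{S}_{L,\tilde{P}_2})$ (27)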

Finally,  Equations  26  and  27  imply  Equation  22,  which  completes  the  proof.  □ 

The  significance  of  this  theorem  is  that  (given  soundness)  verifying  that  a  system  satisfies 
FSC  is  sufficient  to  show  that  it  satisfies  PNI,  which  (as  was  previously  mentioned)  is  a 
necessary  and  sufficient  condition  for  a  system  to  be  free  of  covert  channels.  In  the  next 
section,  we  discuss  the  issue  of  verifying  FSC. 

7  Verification 

7.1  Syntactic  Statement 

In  [30],  McLean  defines  the  Flow  Model  (FM)  with  the  motivation  of  providing  an  abstract, 
but  precise,  explication  of  information  flow  security.  McLean’s  intent  for  FM  is  to  provide  a 
characterization  of  security  against  which  more  concrete  security  models  can  be  evaluated. 
In  [17],  the  first  of  the  present  authors  studies  a  more  concrete  version  of  FM,  called  the 
Applied  Flow  Model  (AFM),  and  shows  that  AFM  captures  a  strictly  stronger  notion  of 
security  than  PNI. 

In  this  paper,  we  have  another  reason  for  studying  AFM:  it  is  more  easily  verified  than  FSC. 

Definition 7.1 Let $L \subseteq C$ be a subject. Suppose $\Gamma$ is a set of premises that describe a
system $\Sigma$. We say that $\Gamma$ satisfies the Syntactic Verification Condition (SVC) with respect
to $L$ if and only if, for every $b_L \in O[L]$, the formula

$\Box(Pr_C(L'_{out} = b_L) = r \rightarrow K_L(Pr_C(L'_{out} = b_L) = r))$

is derivable from $\Gamma$. □

Intuitively,  SVC  says  that  at  all  times,  the  low  environment  knows  the  objective  probability 
distribution  on  its  next  output. 

In  the  next  section,  we  show  this  statement  is  equivalent  to  a  statement  about  conditional 
statistical independence. Namely, conditioned on the previous $L$-history, the next output on
$L$ is statistically independent of the previous non-$L$ (i.e., high) history.
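Conditional independence of this kind can be tested directly on a finite joint distribution. The sketch below is an illustration under assumptions made for the example: the joint distribution is a dictionary keyed by triples (low history, high history, next low output), histories are opaque labels, and the function name is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def next_output_independent_of_high(joint):
    """Check that P(out | low, high) == P(out | low) wherever both are defined.

    `joint` maps (low_history, high_history, next_low_output) -> probability.
    """
    by_low = defaultdict(Fraction)
    by_low_out = defaultdict(Fraction)
    by_low_high = defaultdict(Fraction)
    for (lo, hi, out), p in joint.items():
        by_low[lo] += p
        by_low_out[(lo, out)] += p
        by_low_high[(lo, hi)] += p

    outputs = {out for (_, _, out) in joint}
    for (lo, hi), mass in by_low_high.items():
        if mass == 0:
            continue  # conditioning on a null event is undefined; skip it
        for out in outputs:
            given_both = joint.get((lo, hi, out), Fraction(0)) / mass
            given_low = by_low_out[(lo, out)] / by_low[lo]
            if given_both != given_low:
                return False
    return True

q = Fraction(1, 4)
noisy = {("l", "h0", 0): q, ("l", "h0", 1): q, ("l", "h1", 0): q, ("l", "h1", 1): q}
copy = {("l", "h0", 0): Fraction(1, 2), ("l", "h1", 1): Fraction(1, 2)}

print(next_output_independent_of_high(noisy))
print(next_output_independent_of_high(copy))
```

In the first distribution the next low output is a fair coin regardless of the high history; in the second it copies the high history, so independence fails.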
7.2  Relationship  to  AFM


In this section we show that $\Gamma \models$ SVC if and only if the system specified by $\Gamma$ satisfies AFM
(i.e., the relationship between SVC and AFM is analogous to the relationship between FSC
and PNI).

Definition 7.2 Let $\Sigma$ be a system with computation trees $T(\Sigma)$ and let $L \subseteq C$ be a subject.
We will say that $\Sigma$ satisfies the Applied Flow Model (AFM) with respect to $L$ iff for any tree,
$T_A \in T(\Sigma)$ (satisfying the Secure Environment Assumption with respect to $L$), any point
$P \in T_A$, and any low output vector, $b_L \in O[L]$,

$\mu_A(\mathcal{S}_{C,P}(L'_{out} = b_L) \mid \mathcal{S}_{C,P}) = \mu_A(\mathcal{S}_{L,P}(L'_{out} = b_L) \mid \mathcal{S}_{L,P})$ (28)

□ 

This  definition  is,  except  for  notational  differences,  exactly  the  definition  of  AFM  as  given 
in  [17].  Now  we  can  prove  the  following  theorem. 

Theorem 7.3 Let $\Gamma$ be a set of formulae describing $\Sigma$ and let $L \subseteq C$ be a subject. Then,
$\Sigma$ satisfies AFM with respect to $L$ iff $\Gamma$ satisfies the semantic interpretation of SVC with
respect to $L$. □

Proof: Let $M = (\mathbb{R}, +, \cdot, <, W, T, C, I, O, v, \kappa_1, \ldots, \kappa_{|\mathcal{P}(C)|})$ be a model such that $M \models \Gamma$.

$\Gamma$ satisfies the semantic interpretation of SVC (wrt $L$) iff for every $b_L \in O[L]$,

$M \models \Box(Pr_C(L'_{out} = b_L) = r \rightarrow K_L(Pr_C(L'_{out} = b_L) = r))$

which holds iff for every $b_L \in O[L]$ and every root, $P$, of any tree in $T$,

$v_P(\Box(Pr_C(L'_{out} = b_L) = r \rightarrow K_L(Pr_C(L'_{out} = b_L) = r))) = \mathrm{true}$

which, by applying the semantic assignment function, holds iff for every $b_L \in O[L]$ and every
two points $P_1, P_2 \in W$ such that $\kappa_L(P_1, P_2)$,

$v_{P_1}(Pr_C(L'_{out} = b_L) = r) = \mathrm{true}$ implies $v_{P_2}(Pr_C(L'_{out} = b_L) = r) = \mathrm{true}$

which holds iff for every $b_L \in O[L]$ and every two points $P_1, P_2 \in W$ such that $\kappa_L(P_1, P_2)$,

$v_{P_1}(Pr_C(L'_{out} = b_L)) = v_{P_2}(Pr_C(L'_{out} = b_L))$

which, again by the semantic assignment function, holds iff for every $b_L \in O[L]$ and every
two points $P_1, P_2 \in W$ such that $\kappa_L(P_1, P_2)$,

$\mu_{A(P_1)}(\mathcal{S}_{C,P_1}(L'_{out} = b_L) \mid \mathcal{S}_{C,P_1}) = \mu_{A(P_2)}(\mathcal{S}_{C,P_2}(L'_{out} = b_L) \mid \mathcal{S}_{C,P_2})$ (29)

Now, to show that the semantic interpretation of SVC implies AFM, we assume Equation 29
and show that Equation 28 holds. Consider the right-hand side of Equation 28.

$\mu_A(\mathcal{S}_{L,P}(L'_{out} = b_L) \mid \mathcal{S}_{L,P})$

By the definition of conditional probability and the additive property of probability measures,
this is equal to:

$\dfrac{\sum_{P'} \mu_A(\mathcal{S}_{L,P}(L'_{out} = b_L) \mid \mathcal{S}_{C,P'})\, \mu_A(\mathcal{S}_{C,P'})}{\mu_A(\mathcal{S}_{L,P})}$

(where the summation is taken over all $P'$ such that $P'$ is in the same tree and has the same
$L$-history as $P$).

Limiting $\mathcal{S}_{L,P}(L'_{out} = b_L)$ to those points emanating from $\mathcal{S}_{C,P'}$ results in $\mathcal{S}_{C,P'}(L'_{out} = b_L)$,
so the above is equal to:

$\dfrac{\sum_{P'} \mu_A(\mathcal{S}_{C,P'}(L'_{out} = b_L) \mid \mathcal{S}_{C,P'})\, \mu_A(\mathcal{S}_{C,P'})}{\mu_A(\mathcal{S}_{L,P})}$

Now, by Equation 29, $\mu_A(\mathcal{S}_{C,P'}(L'_{out} = b_L) \mid \mathcal{S}_{C,P'}) = \mu_A(\mathcal{S}_{C,P}(L'_{out} = b_L) \mid \mathcal{S}_{C,P})$ for all $P'$
having the same $L$-history as $P$, so the above is equal to

$\dfrac{\mu_A(\mathcal{S}_{C,P}(L'_{out} = b_L) \mid \mathcal{S}_{C,P}) \sum_{P'} \mu_A(\mathcal{S}_{C,P'})}{\mu_A(\mathcal{S}_{L,P})}$

which, again by the additive property of probability measures, is equal to

$\mu_A(\mathcal{S}_{C,P}(L'_{out} = b_L) \mid \mathcal{S}_{C,P})$

which is precisely the left-hand side of Equation 28 and therefore, the system satisfies AFM.

To show that AFM implies the semantic interpretation of SVC, we assume Equation 28, let
$b_L \in O[L]$ be arbitrary, and let $P_1, P_2 \in W$ be such that $\kappa_L(P_1, P_2)$. We want to show that
Equation 29 holds; however, $P_1$ and $P_2$ may not be in the same computation tree, so we
cannot apply Equation 28 directly. We therefore define an adversary $A_0$ such that $T_{A_0}$ is
guaranteed to contain two points $P'_1$ and $P'_2$ such that $P'_1$ has the same $C$-history as $P_1$ and
$P'_2$ has the same $C$-history as $P_2$.
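Suppose $A(P_1) = (\mathcal{H}_1, \mathcal{L}_1)$ and $A(P_2) = (\mathcal{H}_2, \mathcal{L}_2)$. We define $A_0$ to be $(\mathcal{H}_0, \mathcal{L}_0)$, where for
all $b_H \in I[C - L]$, $\alpha \in I_{C,k}$, and $\beta \in O_{C,k}$,

$\mathcal{H}_0(b_H \mid \alpha, \beta) = \tfrac{1}{2}\mathcal{H}_1(b_H \mid \alpha, \beta) + \tfrac{1}{2}\mathcal{H}_2(b_H \mid \alpha, \beta)$

and for all $a_L \in I[L]$, $\alpha \in I_{L,k}$, and $\beta \in O_{L,k}$,

$\mathcal{L}_0(a_L \mid \alpha, \beta) = \tfrac{1}{2}\mathcal{L}_1(a_L \mid \alpha, \beta) + \tfrac{1}{2}\mathcal{L}_2(a_L \mid \alpha, \beta)$

Since all arcs leading to $P_1$ (resp., $P_2$) are labelled with positive probabilities, there will be
corresponding positive-probability arcs in $T_{A_0}$ leading up to a point $P'_1$ (resp., $P'_2$) with the
same $C$-history.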

Note that the probabilities of reaching $P_1$ and $P_2$ will, in general, be different from the
probabilities of reaching $P_1'$ and $P_2'$, respectively. However, due to our construction of the
computation trees, from any given point, the conditional probability of receiving a particular
output at the next time step is determined solely by the system (and not by the adversary).
Therefore, we have the following.

$$\mu_{A(P_1)}(\mathcal{S}_{C,P_1}(L'_{out} = b_L) \mid \mathcal{S}_{C,P_1}) = \mu_{A_0}(\mathcal{S}_{C,P_1'}(L'_{out} = b_L) \mid \mathcal{S}_{C,P_1'}) \tag{30}$$

$$\mu_{A(P_2)}(\mathcal{S}_{C,P_2}(L'_{out} = b_L) \mid \mathcal{S}_{C,P_2}) = \mu_{A_0}(\mathcal{S}_{C,P_2'}(L'_{out} = b_L) \mid \mathcal{S}_{C,P_2'}) \tag{31}$$

Now, we can apply Equation 28 to the right-hand sides of Equations 30 and 31; in particular,
since $P_1'$ and $P_2'$ have the same $L$-history, both right-hand sides are equal to
$\mu_{A_0}(\mathcal{S}_{L,P_1'}(L'_{out} = b_L) \mid \mathcal{S}_{L,P_1'})$. Therefore, Equations 28, 30, and 31 imply

$$\mu_{A(P_1)}(\mathcal{S}_{C,P_1}(L'_{out} = b_L) \mid \mathcal{S}_{C,P_1}) = \mu_{A(P_2)}(\mathcal{S}_{C,P_2}(L'_{out} = b_L) \mid \mathcal{S}_{C,P_2})$$

and  the  proof  is  complete.  □ 
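The construction of the mixed adversary used in this proof can be illustrated with a small sketch. It assumes, purely for illustration, that an adversary component at a fixed history is a finite distribution stored as a dictionary; the helper `mix` and the numbers are hypothetical. The point it shows is that an equal-weight mixture keeps every choice that either original adversary makes with positive probability, so the arcs leading to both points survive in the new tree.

```python
from fractions import Fraction

def mix(dist1, dist2):
    """Equal-weight mixture of two distributions over the same finite set of choices."""
    half = Fraction(1, 2)
    choices = set(dist1) | set(dist2)
    return {c: half * dist1.get(c, 0) + half * dist2.get(c, 0) for c in choices}

# Hypothetical high-input distributions of two adversaries at one fixed history.
h1 = {0: Fraction(1), 1: Fraction(0)}        # adversary 1 always sends 0
h2 = {0: Fraction(1, 3), 1: Fraction(2, 3)}  # adversary 2 usually sends 1
h0 = mix(h1, h2)

# Any choice possible under either adversary remains possible under the mixture,
# and the mixture is still a probability distribution.
for c in h0:
    if h1[c] > 0 or h2[c] > 0:
        assert h0[c] > 0
assert sum(h0.values()) == 1
```

The reaching probabilities change under the mixture (here the choice 0 has probability 2/3 rather than 1 or 1/3), which matches the remark that only the system's conditional output probabilities, not the probabilities of reaching a point, are preserved.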

7.3  FSC  versus  SVC 

We introduced SVC by claiming it is easier to formally verify than FSC. To see why, consider
the structure of the formulae that need to be derived in verifying FSC, viz,

$$\Box(Pr_L(L'_{out} = b_L) = r \rightarrow K_L(Pr_L(L'_{out} = b_L) = r))$$
The primary difficulty with deriving such a formula is that it requires us to reason about
$L$'s subjective probabilities (i.e., formulae of the form $Pr_L(\varphi)$, where $L \neq C$). We expect
systems will typically be described entirely in terms of objective probabilities (i.e., where all
probability formulae are of the form $Pr_C(\varphi)$). Therefore, deriving a formula in the above form
requires us to reason about how various objective probabilities give rise to other subjective
probabilities. This is a topic we have not pursued in any depth in the present work. (In fact,
the reader may note there is only one axiom in the present logic addressing the interaction
between objective and subjective probabilities, viz, KPr.) However, our intuition is that
the relationship is closely related to the Secure Environment Assumption.

There  are  two  special  cases  of  verifying  FSC  that  are  worth  pointing  out.  First,  in  the  case 
where  we  can  derive 


$$\Box(Pr_L(L'_{out} = b_L) = r \rightarrow Pr_C(L'_{out} = b_L) = r) \tag{32}$$

verifying  FSC  reduces  to  the  problem  of  verifying  SVC.  That  is,  if  (as  part  of  verifying  SVC) 
we  have  derived 


$$\Box(Pr_C(L'_{out} = b_L) = r \rightarrow K_L(Pr_C(L'_{out} = b_L) = r)) \tag{33}$$

then we can use Formulae 32 and 33 in conjunction with Axiom KPr to conclude

$$\Box(Pr_L(L'_{out} = b_L) = r \rightarrow K_L(Pr_L(L'_{out} = b_L) = r))$$

and  thus  prove  FSC.  Of  course,  in  such  cases  we  can  also  simply  verify  SVC  and  avoid  the 
extra  work  of  verifying  Formula  32. 

The  other  special  case  is  when  the  system  behavior  can  be  described  without  any  reference 
to  inputs.  In  this  case,  if  SVC  is  provable,  then  it  will  be  based  on  the  truth  value  of  the 
consequent  of  SVC  without  concern  for  the  antecedent.  Because  of  the  KPr  axiom,  proving 
FSC  (by  showing  the  truth  of  the  consequent)  will  then  be  exactly  as  easy  as  proving  SVC. 

7.4  Examples,  continued 

We note here that the security of the encryption box of Example 3.1 with respect to a subject
$L \subset C$ is formally derivable using SVC. Recall the system specification: If $C = \{h, l\}$,
$I = \{0, 1\}$, and $O = \{0, 1\}$, then the system is specified by the following formula.

$$\Box(Pr_C(l'_{out} = 0) = 0.5 \wedge Pr_C(l'_{out} = 1) = 0.5)$$

Recall  also  that  subjects  are  assumed  to  know  the  system  description.  Thus, 

$$\Gamma = \{K_L \Box(Pr_C(l'_{out} = 0) = 0.5 \wedge Pr_C(l'_{out} = 1) = 0.5)\}$$

The only $b_L \in O[L]$ are 0 and 1; hence, the only antecedents for the SVC schema that are
consistent with $\Gamma$ are $Pr_C(L'_{out} = 0) = 0.5$ and $Pr_C(L'_{out} = 1) = 0.5$. Thus, SVC with respect to $L$ for
this system consists of the following two formulae.

$$\Box(Pr_C(L'_{out} = 0) = 0.5 \rightarrow K_L(Pr_C(L'_{out} = 0) = 0.5))$$

$$\Box(Pr_C(L'_{out} = 1) = 0.5 \rightarrow K_L(Pr_C(L'_{out} = 1) = 0.5))$$

Each of these is derivable from $\Gamma$ using propositional reasoning, Modus Ponens and
Necessitation, and axioms K, 4, and KD. Further, since SVC is stronger than FSC, such a proof is
sufficient to show this system satisfies FSC. (In the typical case one would proceed through
SVC to prove FSC, as we have done. As noted above, however, for special cases such as this
it is equally easy to derive FSC directly.)
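The specification above can be sanity-checked by exhaustive enumeration. The sketch below assumes, for illustration only, that the encryption box XORs each high input bit with a fresh, uniformly random key bit; the specification quoted here fixes only the resulting output probabilities, so this mechanism and the function name `low_output_distribution` are our assumptions.

```python
from fractions import Fraction
from itertools import product

def low_output_distribution(high_input_dist):
    """Distribution of the low output when out = high_bit XOR key_bit,
    with key_bit uniform and independent of the high input (assumed model)."""
    key_dist = {0: Fraction(1, 2), 1: Fraction(1, 2)}
    out = {0: Fraction(0), 1: Fraction(0)}
    for (h, ph), (k, pk) in product(high_input_dist.items(), key_dist.items()):
        out[h ^ k] += ph * pk
    return out

# Whatever distribution the high environment uses for its input bit,
# both low outputs have probability 0.5, as the specification requires.
for p_one in (Fraction(0), Fraction(1, 3), Fraction(1)):
    dist = low_output_distribution({0: 1 - p_one, 1: p_one})
    assert dist == {0: Fraction(1, 2), 1: Fraction(1, 2)}
```

Under this assumed model the output distribution does not depend on the high input, which is the semantic content of the two SVC formulae for this system.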

We also observe that for the insecure encryption box of Example 3.2, $\Gamma \nvdash$ FSC (where $\Gamma$
includes those formulae that describe the system as well as the assumptions about knowledge
thereof). It is obvious that the insecure encryption box fails to satisfy PNI. By the attack
described in the original example, we can easily find two adversaries that satisfy the Secure
Environment Assumption and agree on low behavior, yet disagree on the probability of
certain low events. Indeed, the low environment can assign 0/1 probabilities to any output
sent by the high part of the adversary. By Theorem 6.9, we thus have that $\Gamma \nvDash$ FSC. And,
by soundness (Theorem 5.1), it follows that $\Gamma \nvdash$ FSC.

8  Conclusions 

We  have  given  a  logic  for  specifying  and  reasoning  about  the  multilevel  security  of  proba¬ 
bilistic  computer  systems.  We  have  established  connections  between  information-theoretic 
formulations  of  security  and  logical  formulations  of  knowledge  and  probability  in  distributed 
systems. 
To date, we have only been able to specify and verify toy systems using our logic. Our
SVC takes one small step towards practically verifiable security. However, it is unlikely that
one could ever use FSC, or even SVC, for verifying real systems since real multilevel-secure
systems (e.g., as in Karger et al. [23]) are too complex to be completely free of covert channels,
even at the specification level (e.g., as in Browne [5]). Therefore, they cannot satisfy our ideal
notions of security. Nevertheless, we feel it is important to cast ideal security in a precise
logical framework. It is our hope that extensions of this work, using less ideal notions of
security that allow some limited information flow, will ultimately lead to machine-checkable
proofs of security for real systems.